UBC Theses and Dissertations

Biological classification of clinical breast cancer using tissue microarrays Cheang, Maggie Chon U 2008

BIOLOGICAL CLASSIFICATION OF CLINICAL BREAST CANCER USING TISSUE MICROARRAYS

by

MAGGIE CHON U CHEANG

B.Sc., The University of British Columbia, 1998
M.MedSc., The University of Hong Kong, 2000

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE STUDIES

(Pathology)

THE UNIVERSITY OF BRITISH COLUMBIA
(Vancouver)

September 2008

© Maggie Chon U Cheang, 2008

ABSTRACT

Gene expression profiles have identified five major molecular breast cancer subtypes (Luminal A, Luminal B, Basal-like, HER2+/estrogen receptor−, and Normal Breast-like) that show significant differences in survival. The cost and complexity of gene expression technology have impeded its clinical implementation. By comparison, immunohistochemistry is an economical technique applicable to the standard formalin-fixed, paraffin-embedded material commonly used in hospital labs, and has the advantage of simultaneous interpretation with histomorphology.

In this thesis, I hypothesize that a surrogate panel of immunohistochemical biomarkers can be developed to discriminate the breast cancer biological subtypes. The main study cohort consists of over 4000 primary invasive breast tumors, assembled into tissue microarrays. These patients were referred to the British Columbia Cancer Agency between 1986 and 1992 and have staging, pathology, treatment and follow-up information.
In summary, our results demonstrate that (1) the rabbit monoclonal antibody, SP1, is an improved standard for immunohistochemical estrogen receptor assessment in breast cancer; (2) the transcription factor, GATA-3, is almost exclusively expressed among estrogen receptor positive tumors but does not seem to predict tamoxifen response among estrogen receptor positive patients; (3) the proliferation marker, Ki-67, together with HER2 can segregate Luminal A from Luminal B subtypes, which carry distinct risks for breast cancer relapse and death; and (4) adding the basal markers EGFR and CK5/6 to the “triple negative” phenotype provides a more specific definition of basal-like breast cancer that better predicts patient survival.

These results consistently demonstrate that an immunopanel of six biomarkers (estrogen receptor, progesterone receptor, HER2, Ki-67, epidermal growth factor receptor and cytokeratin 5/6) can be readily applied to standard pathology specimens to subtype breast cancer samples based on their underlying molecular biology. These findings have been considered sufficient to justify application of this panel to NCIC (MA5, MA12) and CALGB (9341 and 9741) clinical trial specimens. This follow-up work, which is underway, will determine whether the six-marker immunopanel can guide decisions about which patients need aggressive systemic drug treatment, and thereby ensure patients get the most effective, individualized adjuvant systemic therapy for their breast tumor.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
TERMS AND DEFINITIONS
ACKNOWLEDGEMENTS
CO-AUTHORSHIP STATEMENT

CHAPTER 1: INTRODUCTION
RISK FACTORS
DIAGNOSIS
PATHOLOGY
PROGNOSTIC AND PREDICTIVE MARKERS
TREATMENT OF BREAST CANCER
     Surgery
     Radiotherapy
     Adjuvant Systemic Therapy
THESIS OBJECTIVES
References

CHAPTER 2: GENE EXPRESSION PROFILING OF BREAST CANCER
CONNECTING TEXT
GENE EXPRESSION PROFILING OF BREAST CANCER
EXPRESSION PROFILING METHODOLOGY
     Microarray Platforms Used in Breast Cancer Studies
     Data Processing
     Presentation of Primary Data
     Statistical Methods
BREAST CANCER EXPRESSION PROFILES
     Studies to Understand Breast Tumor Biology
     Studies to Improve the Diagnosis of Breast Cancer
     Studies to Identify Prognostic Signatures
     Studies to Predict Tumor Response to Adjuvant Systemic Therapy
     Studies to Identify Therapeutic Targets
CHALLENGES AND FUTURE DIRECTIONS
     Potential Limitations of Breast Cancer Profiling Studies
     Contributions of Noncancerous Elements to Breast Cancer Expression Profiles
     Gene Predictors as Diagnostic Test for Clinical Use
References

CHAPTER 3: IMMUNOHISTOCHEMICAL DETECTION USING THE RABBIT MONOCLONAL ANTIBODY SP1 OF ESTROGEN RECEPTOR IN BREAST CANCER IS SUPERIOR TO MOUSE MONOCLONAL ANTIBODY 1D5 IN PREDICTING SURVIVAL
CONNECTING TEXT
INTRODUCTION
MATERIALS AND METHODS
     Study Population
     Tissue Microarrays and Immunohistochemistry
     ER Scoring System
     Statistical Analysis
RESULTS
     Comparison of ER Expression by SP1 and 1D5 with DCC and Clinicopathological Characteristics
     Prognostic Values of SP1, 1D5 and DCC Positive Expression
     Concordant and Discordant Cases between SP1 and 1D5
     Multivariable Analysis
DISCUSSION
References

CHAPTER 4: GATA3 EXPRESSION IN BREAST CANCER HAS A STRONG ASSOCIATION WITH ESTROGEN RECEPTOR BUT LACKS INDEPENDENT PROGNOSTIC VALUE
CONNECTING TEXT
INTRODUCTION
MATERIALS AND METHODS
     Study Population
     Tissue Microarrays and Immunohistochemistry
     Statistical Analysis
     Missing GATA-3 Data
RESULTS
     Patient Demographics and Pathological Data
     GATA-3 Immunostaining
     Association with Known Pathological Factors
     Univariable Survival Analysis
     Multivariable Survival Analysis
DISCUSSION
CONCLUSION
References

CHAPTER 5: KI-67 AND HER2 IDENTIFY HORMONE RECEPTOR POSITIVE LUMINAL B BREAST CANCERS WITH POOR PROGNOSIS
CONNECTING TEXT
INTRODUCTION
MATERIALS AND METHODS
     Quantitative RT-PCR gene expression profiling to define breast cancer subtypes
     Tissue Microarrays
     Immunohistochemistry and Fluorescent In-situ Hybridization (FISH)
     Immunopanel to define Luminal A and Luminal B subtypes
     Statistical Analysis
RESULTS
     Determination of optimal cutoff value for Ki-67 index to define Luminal B tumors
     Predicting survival among hormone receptor positive breast cancers using a surrogate immunopanel of ER, PgR, Her2 and Ki-67
     Clinical “low-risk” patients receiving no AST
     Adjuvant tamoxifen subset
     Combined tamoxifen and chemotherapy adjuvant subset
DISCUSSION
References

CHAPTER 6: BASAL-LIKE BREAST CANCER DEFINED BY FIVE BIOMARKERS HAS SUPERIOR PROGNOSTIC VALUE THAN TRIPLE-NEGATIVE PHENOTYPE
CONNECTING TEXT
INTRODUCTION
MATERIALS AND METHODS
     Patients and Tissue Microarrays
     Immunohistochemistry and Fluorescent In-Situ Hybridization (FISH)
     Definition of Breast Cancer Biological Subtypes by Immunohistochemistry
     Statistical Analysis
RESULTS
     Clinicopathologic Characteristics of Breast Cancer Subtypes
     Breast Cancer Specific Survival by Immunohistochemical Subtype
     Prognostic Values of Breast Cancer Subtypes Within Treatment Subsets
DISCUSSION
References

CHAPTER 7: GENERAL DISCUSSION AND CONCLUSION
References

APPENDIX
ETHICAL APPROVAL

LIST OF TABLES

Table 1.1   List of selected databases with publicly available breast cancer microarray data
Table 1.2   Upregulated genes in a cancer expression profile
Table 3.1   Frequencies of SP1 and 1D5
Table 3.2   Comparison of SP1 and 1D5 against DCC
Table 3.3   Cox model of SP1, 1D5 and DCC among patients treated with adjuvant tamoxifen only
Supplemental Table 3.1   Frequencies of SP1 and 1D5 using TMA and Whole Sections
Supplemental Table 3.2   Patients’ characteristics
Supplemental Table 3.3   Survival Probabilities of SP1, 1D5 and DCC
Supplemental Table 3.4   Cox Model in whole cohort
Table 4.1   Summary of Clinicopathological Variables
Table 4.2   Univariable Hazard Ratios for Standard Clinicopathological Variables
Table 4.3   Cox PH model of GATA-3
Table 5.1   Patients’ Characteristics
Table 5.2   Cox Model of Breast Cancer Subtypes, No AST
Table 5.3   Cox Model of Breast Cancer Subtypes, Adjuvant tamoxifen
Table 5.4   Cox Model of Breast Cancer Subtypes, Adjuvant tamoxifen and chemotherapy
Supplemental Table 5.1   Clinicopathological Characteristics of qRT-PCR Breast Tumors
Supplemental Table 5.2   Patients Characteristics of BCCA Series
Supplemental Table 5.3   Cox Model of Breast Cancer Subtypes, Node Negative, Adjuvant Tamoxifen
Supplemental Table 5.4   Cox Model of Breast Cancer Subtypes, Node Positive, Adjuvant Tamoxifen
Supplemental Table 5.5   Clinicopathological Characteristics Among “High-risk” Hormone Receptor Positive Tumors
Table 6.1   Clinicopathological Characteristics of Whole Cohort
Table 6.2   Clinicopathological Characteristics Among Breast Cancer Subtypes
Table 6.3   Cox Model of Breast Cancer Subtypes
Table 6.4   Cox Model of Basal-like Tumors
Supplemental Table 6.1   Immunopanel of Breast Cancer Subtypes
Supplemental Table 6.2   Tumor Characteristics of Basal-like Tumors in Chemotherapy Treated Cohort
Appendix Table A.1   Cox Model of alpha-Basic-Crystallin in Basal-like Tumors
Appendix Table A.2   Cox Model of Regional Recurrences of Breast Cancer Subtypes
Appendix Table A.3   Cox Model of Local Recurrences of Breast Cancer Subtypes

LIST OF FIGURES

Figure 1.1   Publicly accessible primary microarray data
Figure 1.2   Expression Profiling by qRT-PCR
Figure 1.3   Immunopanel of Breast Cancer Subtypes
Figure 3.1   Representative TMA Images of SP1 and 1D5
Figure 3.2   BCSS analysis of SP1, 1D5 and DCC
Figure 3.3   BCSS analysis of SP1 and 1D5 concordant and discordant cases
Supplemental Figure 3.1   RFS analysis of SP1 and 1D5 concordant and discordant cases
Supplemental Figure 3.2   BCSS analysis of SP1, 1D5 and DCC among patients receiving no AST
Figure 4.1   GATA-3 Immunostains
Figure 4.2   Survival Analysis of GATA-3 in Whole Cohort
Figure 4.3   Survival Analysis of GATA-3 in ER Positive Breast Tumors
Figure 5.1   ROC Curve of Ki-67 Index
Figure 5.2   Survival Analysis of Breast Cancer Subtypes Among Node Negative, Hormone Receptor Positive Tumors, No AST
Figure 5.3   Survival Analysis of Breast Cancer Subtypes Among Hormone Receptor Positive Tumors, Adjuvant Tamoxifen
Figure 5.4   Survival Analysis of Breast Cancer Subtypes Among Hormone Receptor Positive Tumors, Adjuvant Tamoxifen and Chemotherapy
Supplemental Figure 5.1   ROC Curve of Ki-67 Index
Supplemental Figure 5.2   Survival Analysis of Breast Cancer Subtypes Among Patients with Node Negative, Hormone Receptor Positive Tumors, Adjuvant Tamoxifen
Supplemental Figure 5.3   Survival Analysis of Breast Cancer Subtypes Among Patients with Node Positive, Hormone Receptor Positive Tumors, Adjuvant Tamoxifen
Figure 6.1   BCSS of Breast Cancer Subtypes
Figure 6.2   BCSS of Basal-like Tumors Receiving Adjuvant AC or FAC Chemotherapy
Supplemental Figure 6.1   BCSS of Basal-like Tumors, Adjuvant Treatment Subsets
Appendix Figure A.1   BCSS of alpha-Basic-Crystallin Among Basal-like Tumors

TERMS AND DEFINITIONS

Ab: antibody
ASCO: American Society of Clinical Oncology
Adjuvant systemic therapy (AST): drugs given following cancer surgery in an attempt to inhibit growth of residual or metastatic disease
aCGH: array-based comparative genomic hybridization
AC: doxorubicin and cyclophosphamide
BCCA: British Columbia Cancer Agency
BCSS: breast cancer specific survival
Bootstrapping: a tool to assess statistical accuracy and model overfitting by resampling the data with replacement
CALGB: Cancer and Leukemia Group B
CK5/6: cytokeratin 5/6
Cluster analysis: mathematical means of determining relationships among genes or among tumors based on a large matrix of gene expression measurements
CMF: cyclophosphamide, methotrexate, fluorouracil
Complete pathological response: tumor excised after neoadjuvant therapy that has no viable cancer cells visible when examined under the microscope
Cox PH model: Cox proportional hazard model
DCC: dextran-coated charcoal
Dendrogram tree: a graphical display of the relationship of data elements as a series of binary tree branches
DRFS: distant relapse-free survival
ER: estrogen receptor
EGFR: epidermal growth factor receptor
External validation set: an independent data set used to validate hypotheses initially derived from another patient population
FAC: fluorouracil, doxorubicin and cyclophosphamide
FISH: Fluorescent In-situ Hybridization
Gene expression profiling: measurement of mRNA levels from thousands of genes in a biological sample
Heatmap: visual display of clustered gene expression data, where color represents degree of gene expression
HER2: human epidermal growth factor receptor 2 oncoprotein, product of ERBB2 gene
Hierarchical clustering: a method to group data objects into clusters based on their similarities (e.g., in gene expression)
HR: hazard ratio
IHC: immunohistochemistry
Intrinsic genes: genes showing large variations between tumor samples but not paired samples from the same tumor (favors biologic over technical variability)
LVI: lymphovascular invasion
NCIC: National Cancer Institute of Canada
NSABP: National Surgical Adjuvant Breast and Bowel Project
Overfitting: a statistical model too closely optimized to the data set from which it was derived that therefore extrapolates poorly
OS: overall survival
PR: progesterone receptor
qRT-PCR: quantitative real-time polymerase chain reaction
RFS: relapse-free survival
ROC: receiver operating characteristic curve
TMA: tissue microarray
Training set: a data set used for hypothesis generation and statistical model building, often used for initial identification of gene signatures

ACKNOWLEDGEMENTS

I would like to express my deepest gratitude to my supervisor, Dr Torsten O. Nielsen, for his constant support and encouragement throughout my training as his graduate student. I am really grateful to have the opportunity to learn from such a great teacher. I would also like to thank my mentors, Drs Chris Bajdik, David Huntsman and Stephen Chia for their guidance. I have had a wonderful time in the Genetic Pathology Evaluation Centre, mostly because of the great laboratory members, with special thanks to Dr C Blake Gilks, Lindsay Brown, May He, Leah Prentice, Melinda Miller, Erika Yorida, Samuel Leung and David Voduc. Finally, I would like to thank my family for bringing me up in such a loving and stress-free environment.

CO-AUTHORSHIP STATEMENTS

(i) Chapter 1 -- Gene Expression Profiling of Breast Cancer

I did the literature search and was responsible for reviewing the references. I assembled Table 1 and Figures 2 and 3. I wrote the entire manuscript and made all the revisions for the published version. Torsten O. Nielsen provided supervision and edited all drafts of the manuscript.

(ii) Chapter 3 -- Immunohistochemical detection using the new rabbit monoclonal antibody SP1 of estrogen receptor in breast cancer is superior to mouse monoclonal antibody 1D5 in predicting survival

I designed the study approach, together with Torsten O. Nielsen and Allen Gown. Diana Treaba and Allen Gown interpreted all the immunohistochemical data. I took full responsibility for managing the data and linking the immunohistochemical data to the clinical database. I was responsible for all of the data analyses, including both the design/ideas and the actual statistical computing work. I wrote the entire manuscript. Journal reviewers requested several changes, additions, clarifications and additional data, which I addressed together with Torsten O. Nielsen. He provided supervision and edited all the revisions of the manuscript. Allen Gown provided additional data requested by the journal reviewers. This was a collaborative study between the Genetic Pathology Evaluation Centre and the British Columbia Cancer Agency. All the co-authors contributed to the collection of outcome data, funding of the project and design of the study at different levels according to their areas of expertise.

(iii) Chapter 4 -- GATA-3 expression in breast cancer has a strong association with estrogen receptor but lacks independent prognostic value

Overall, I contributed 45% of the design of this study. I carried out 100% of the survival analyses of GATA-3 in the series of 438 invasive breast tumors. Using these results I put forward a proposal to the Breast Cancer Outcomes Unit and gained approval to validate this biomarker on the 4000 invasive breast tumor series. Torsten O. Nielsen interpreted all the immunohistochemical data, although I assisted in data entry, quality control and database management. Dave Voduc and I were responsible for 90% and 10% of the data analyses respectively in the series of 4000 invasive breast tumors, which filled the remainder of this paper.
I initially contributed 50% of the statistical analyses for the first version of the manuscript, but these were revised in subsequent versions. I assisted in manuscript organization and edited the drafts.

(iv) Chapter 5 -- Ki-67 and HER2 identify hormone receptor positive Luminal B breast cancers with poor prognosis

I designed the whole study approach, together with Torsten O. Nielsen. He interpreted all the immunohistochemical data. I assisted in data entry, quality control and database management. I was responsible for all of the data analyses (except the centroid assignments using qPCR data), which included the design, statistical approaches and the actual statistical computing work. I wrote the entire manuscript and subsequent drafts. Torsten O. Nielsen provided supervision of the whole project and edited all the drafts. All the co-authors contributed to the design, funding of the project, qRT-PCR profiling of the tumor tissues and editing of the drafts at different levels according to their areas of expertise.

(v) Chapter 6 -- Basal-like breast cancer defined by five biomarkers has superior prognostic value than triple-negative phenotype

I designed the study together with Torsten O. Nielsen. He interpreted all the immunohistochemical data. I assisted in data entry, quality control and database management. I was responsible for all of the data analyses, including the design, statistical approaches and the actual statistical computing work. I wrote the entire manuscript and handled the subsequent revisions to address the changes and clarifications requested by the journal reviewers. Torsten O. Nielsen provided supervision of the whole study design and edited the drafts. The co-authors contributed to the design and funding of the project and editing of the drafts at different levels according to their areas of expertise.

INTRODUCTION

Breast cancer is the most common cancer in North American women.
The age adjusted annual incidence rate is approximately 126 per 100,000 women, and it is estimated that a total of 204,860 women will be diagnosed with and 45,780 will die of this disease in 2008 (National Cancer Institute SEER 2008 and Canadian Cancer Society 2008). From 2001-2005, the median age at diagnosis was 61 years and the median age at breast cancer death was 69 years; the age adjusted annual death rate was 25 per 100,000 women. The mortality rates are highest among patients diagnosed at age < 35 years or > 75 years (Canadian Cancer Society 2008). As a result of early detection and the application of adjuvant systemic therapy, breast cancer mortality in the Western world has nevertheless been in decline for the past decade(1).

RISK FACTORS

The lifetime risk of developing breast cancer is approximately 1 in 8 women. The incidence increases with age: 11% of cases are diagnosed between the ages of 35-44, 22% between 45-54, 23% between 55-64, 20% between 65-74 and 17% between 75-84. Although the incidence of breast cancer is higher among Caucasian Americans, African Americans have a higher risk of mortality per breast cancer case(2).

Family history is also an important risk factor; approximately 5-10% of breast cancers result from inherited susceptibility. The positive association is dependent on the number of affected first- and second-degree relatives, the age of the relatives at diagnosis and the age of the patient. The most common germline mutations are those of the BRCA1 and BRCA2 genes, which have normal roles in DNA repair and growth regulation(3).

Pre-existing benign breast conditions can also be closely linked to breast cancer risk. Proliferative lesions without atypia (excessive growth of cells in the ducts or lobules of the breast tissue), which include fibroadenomas, sclerosing adenosis and intraductal papilloma, are associated with a 1.5 to 2-fold increase in the risk of breast cancer(4).
Proliferative lesions with atypia (excessive growth of cells in the ducts or lobules of the breast tissue, in which the cells no longer appear normal), which include atypical ductal hyperplasia and atypical lobular hyperplasia, increase the risk 4 to 5 times.

Early menarche and late menopause are risk factors, as many breast tumors are estrogen-responsive(4). The three major endogenous estrogens in women are estradiol, estriol and estrone. Estrogen is produced by the ovaries. In postmenopausal women, dehydroepiandrosterone is produced by the adrenal glands and is metabolized in peripheral fat, bone and endothelium to estradiol and estrone by the aromatase enzyme. Oral contraceptives do not increase the risk of breast cancer in later life, but hormone replacement therapy is associated with a 1.2-fold increase in risk. Obesity and high fat intake also increase the risk of breast cancer, as does high alcohol consumption. In contrast, early pregnancy and breast-feeding are protective(4). Physical activity is thought to be protective, but this may be dependent on the positive effect of activity on weight control.

DIAGNOSIS

The three most common screening techniques are mammography(5), clinical examination and breast self-examination. Generally speaking, women with a family history should initiate screening earlier. In British Columbia, the screening program is available by self-referral to women aged 40-79 years. Possible findings from screening mammography include a mass lesion, microcalcifications(4) and architectural distortion. A diagnostic mammogram, using additional views including magnification views, is undertaken after abnormal screening mammograms or clinically suspicious findings. Screening Magnetic Resonance Imaging is more sensitive but less specific than mammography. Ultrasound is useful for identifying benign cysts and fibroadenomas, and is usually the first technique recommended to evaluate young (<30 years) women presenting with a breast mass.
However, the only way to confidently determine whether a breast mass or mammographic abnormality is benign or cancerous is to obtain a tissue biopsy. Fine needle aspiration(6) is probably the least resource intensive method. This technique involves using a very thin needle and syringe to remove either fluid from a cyst or clusters of cells from a solid mass. However, the cells removed by this method no longer maintain their original architecture, and this technique cannot distinguish invasive from in situ carcinoma. In addition, the accuracy of fine needle aspiration is highly dependent on operator experience. Image-guided core needle biopsy, such as stereotactic or ultrasound guided core needle biopsy, offers a more definitive histologic diagnosis. This technique is less invasive than an excision biopsy. For patients with suspicious microcalcifications, stereotactic core needle biopsy is usually one of the first options.

PATHOLOGY

Approximately 95% of breast carcinomas arise from epithelial tissues; the remaining 5% include lymphomas and sarcomas. In situ carcinoma refers to the presence of malignant cells that remain confined within the ducts or lobules by the basement membrane. Invasive (infiltrating) carcinoma refers to the presence of malignant cells invading beyond the basement membrane, to involve the stroma and in some cases extending into the vasculature(4). The four most frequent subtypes of carcinomas are ductal (~80%), lobular (~10%), tubular (~6%) and medullary (~2%).

On gross examination, invasive ductal carcinoma appears as a grey-white, hard mass. Microscopically, this subtype is characterized by nests and cords of cells with varying degrees of gland formation. The malignant cells induce a fibrous response in the surrounding breast tissue. Invasive ductal carcinoma can be classified into histologic grade 1-3 based on the degree of glandular differentiation, nuclear atypia and mitotic count.
The Nottingham grading system(7) formalized these criteria and is a strong predictor of prognosis in breast cancer patients. Lymphovascular invasion, perineural invasion, margin status, presence of hormone receptors (estrogen receptor and progesterone receptor), and HER2 amplification are also important features included in pathology reports.

Tubular carcinomas are cytologically low grade and consist exclusively of well-formed tubules(4). This subtype has a very good prognosis compared to invasive ductal carcinomas. About 95% of all tubular carcinomas express hormone receptors. There are also several rare histopathologic subtypes of breast cancer. Mucinous carcinoma occurs more frequently in older women and has a slow growth rate(4). This subtype also has a favorable prognosis. Medullary carcinoma has more aggressive pathological features, but the prognosis is slightly more favorable than that of invasive ductal carcinoma.

Biomarker evaluation is an important part of breast cancer pathology reports, because beyond conventional chemotherapy options, two types of systemic therapies are employed based on specific molecular targets expressed by many breast tumors. For the approximately two-thirds of breast tumors that are positive for estrogen receptor (ER) or progesterone receptor (PR), endocrine therapy (tamoxifen or aromatase inhibitors) is commonly used(8). Breast cancer patients with HER2 positive disease may be treated with trastuzumab(9), a monoclonal antibody that inhibits tumor growth by binding to the HER2 oncoprotein on breast cancer cells. The American Society of Clinical Oncology (ASCO) Tumor Markers Expert Panel and the joint ASCO/College of American Pathologists HER2 panel recommend measurement of these markers for all invasive breast cancers(10, 11).

ER and PR are steroid hormone receptors that reside in the cytosol and migrate to the nucleus upon binding of their ligand.
Once inside the nucleus, they bind to specific sequences on DNA and regulate the expression of target genes such as GATA3. ER testing is considered essential for all invasive primary breast cancers due to its strong predictive value for benefit from endocrine therapy. ER positive tumors respond well to tamoxifen(12), and retain substantial benefit from aromatase inhibitors in postmenopausal patients. From the 1970s to the early 1990s, ER levels were measured using the dextran-coated charcoal (DCC) radioimmunoassay. This biochemical ligand-binding assay involved homogenization of a fresh-frozen portion of the breast tumor and then preparation of the cytosol by centrifugation. The cytosol was then incubated with radioactively labeled estradiol. The receptor-bound estradiol was then separated from the unbound fraction with a suspension of DCC, i.e. the DCC absorbed the unbound estradiol. This method required special tissue handling to set aside fresh tissue for the assay and did not reveal the morphologic context. Later, ER was shown to be expressed exclusively in the nucleus and to be detectable with specific antibody reagents. Currently, measurement of ER is done at diagnostic laboratories by immunohistochemistry using a monoclonal anti-ER antibody. Immunohistochemistry has the advantage of showing in which cells a protein is located and can be performed on formalin-fixed, paraffin-embedded tissue. In brief, ER immunohistochemical staining involves administering the primary anti-ER antibody that binds specifically to antigens in the tissue section, followed by incubation with a secondary (usually biotin-labeled) antibody that recognizes the constant portion of the first antibody. Streptavidin-coupled horseradish peroxidase then binds the biotin with high affinity, and the enzyme reacts with the applied chromogen precursor (3,3’-Diaminobenzidine) to produce a brown stain, localized to the site bound by the primary and secondary antibodies.
The level of ER positive expression is then assessed according to the fraction of tumor nuclei staining. ER positive tumors are more likely to be well differentiated, and ER expression is inversely correlated with HER2.

HER2 is a cell membrane surface-bound receptor tyrosine kinase involved in a signal transduction pathway for cell growth and differentiation. HER2 positive tumors are defined by gene amplification (HER2:CEP17 ratio ≥ 2.0) by fluorescence in situ hybridization, and/or by protein over-expression (HercepTest™ IHC score of 3+, indicating uniform, strong membranous staining). To improve the quality of HER2 testing, ASCO and the College of American Pathologists revised the criteria in 2007(10). Tumors with IHC scores of 2+ or with HER2 gene amplification ratios of 1.8-2.2 are considered equivocal and require additional testing for final determination.

PROGNOSTIC AND PREDICTIVE MARKERS

The American Joint Committee on Cancer staging system for invasive breast carcinomas is based on metastasis status, axillary lymph nodes and tumor size, the three strongest prognostic markers. Breast cancer patients with distant metastases are unlikely to be cured and have an extremely poor prognosis. The common distant metastatic sites are brain, lung, bones, liver and adrenals. In the absence of distant metastasis, lymph node status is the strongest prognostic factor. The 10-year relapse free survival for node negative tumors is 70-80%, but falls to 35-40% with 1-3 positive nodes and 10-15% with more than 10 positive nodes(4). The risk of lymph node metastasis increases with the size of the tumor, but tumor size is an independent prognostic marker. Patients with a tumor size ≥ 2 cm are usually considered at high risk of dying of breast cancer. Minor prognostic factors include tumor grade, lymphovascular invasion and proliferation rate.
Patients with grade 3 tumors, presence of tumor cells in the lymphatics or small capillaries surrounding tumors, and/or high proliferation generally have a worse prognosis(4).

Hormone receptors and HER2 are both prognostic and predictive markers. Tumors expressing ER or PR have a good prognosis whereas HER2 positive cases have a poor prognosis. ER status is age dependent; approximately 80% of breast cancer patients > 65 years old have ER positive disease. ER positive status also correlates with other clinical features suggestive of favorable outcome, such as grade 1 tumors and indolent clinical history. ER is a strong predictive marker of response to adjuvant tamoxifen. One important study showed that 5 years of adjuvant tamoxifen reduced mortality by 23% and 36% in tumors with low and high ER positivity, respectively(12). However, a portion of ER positive tumors develop recurrence, and it is becoming clear that ER positive tumors are molecularly heterogeneous(13, 14). PR is estrogen dependent. Tumors co-expressing ER and PR generally have a favorable outcome and are thought to respond well to adjuvant tamoxifen; the improvement in relative risk was 37% for ER positive, PR positive tumors but marginally less, at 32%, for ER positive, PR negative tumors. Approximately 15% of breast cancers express HER2, which predicts response to trastuzumab. HER2 is also suggested to be related to endocrine resistance; clinical data from several adjuvant studies suggest that tumors co-expressing ER and HER2 show less benefit from tamoxifen(15).

TREATMENT OF BREAST CANCER

Early stage breast cancer patients are treated with surgery followed by radiotherapy and/or adjuvant systemic therapy (hormonal therapy and/or cytotoxic chemotherapy).
The use of these treatments depends on the tumor size and type, extent of axillary lymph node involvement, expression of hormone receptors (ER and/or PR) and HER2, location of the tumor in the breast, age, the physical condition of the patient and patient preference.

Surgery

Women with early breast cancers usually have the option of mastectomy or breast conservation therapy(16). Modified radical mastectomy (removal of the entire breast tissue along with regional lymph node dissection) is the standard mastectomy operation and was historically considered the standard of care at the British Columbia Cancer Agency. Today, women who require the modified radical mastectomy generally have larger, locally advanced tumors, or express a preference for this procedure.

Most breast tumors may be removed using breast-conserving surgery (only a part of the affected breast being removed). Segmental mastectomy (sometimes referred to as lumpectomy) is wide excision for removal of the tumor and a small margin of the surrounding normal tissue, often followed by axillary node dissection. If the pathologist finds cancer cells at the margin of the removed breast tissue, the surgeon may need to remove additional tissue, a procedure known as re-excision.

Radiation Therapy

Radiotherapy is recommended after breast conserving surgery to reduce the risk of local recurrence(17). Breast radiotherapy refers to the use of radiation to damage the actively dividing cancer cells in the breast or the lymph nodes, and is considered to be an effective treatment for local control (reduction of local recurrence)(18, 19). An updated meta-analysis by the Early Breast Cancer Trialists’ Collaborative Group (EBCTCG) reported that improved local control at 5 years resulted in a statistically significant improvement in both breast cancer survival and overall survival at 15 years(20).
Radiotherapy after mastectomy is also recommended for women at high risk of locoregional recurrence, including those women in whom four or more lymph nodes are involved(17). At the British Columbia Cancer Agency, radiation is recommended to all high-risk patients (≥ 4 positive nodes, any node with tumor size ≥ 2 cm or > 50% of nodes positive) for local control. However, there are complications associated with these procedures that depend on the radiation dose and volume and also on patient specific factors such as the presence of comorbid disease and body habitus. The most common complications include lymphedema, pneumonitis, rib fracture and cardiovascular morbidity.

Adjuvant Systemic Therapy

Adjuvant systemic therapy includes hormone therapy, chemotherapy or a combination of both. The primary goal of these adjuvant systemic treatments is to control undetected residual deposits of tumor after surgery and to improve overall survival by reducing the risk of recurrence(8).

Women whose tumors express estrogen receptor are treated with endocrine therapy. Hormone therapy works by depriving the cancer cells of estrogen or by modulating the cells through the ER. Tamoxifen, an estrogen receptor modulator, is only effective for ER positive tumors and works equally well for both premenopausal and postmenopausal women. Randomized trials have demonstrated that 5 years of tamoxifen is superior to 1-2 years of tamoxifen(8).

Aromatase inhibitors, another type of adjuvant systemic therapy, act by inhibiting the peripheral conversion of androgen into estrogen, but do not alter the ovarian production of estrogen. Therefore, they are only effective in post-menopausal women, in whom they have been proven to be better than tamoxifen alone in the adjuvant setting(21, 22).

Women with HER2 positive tumors are recommended to receive trastuzumab, a humanized monoclonal antibody that targets the human epidermal growth factor receptor 2, in addition to chemotherapy.
Data from an interim analysis of two randomized clinical trials suggested that trastuzumab combined with paclitaxel after doxorubicin and cyclophosphamide improves the outcome among women with HER2 positive tumors(23). There was an absolute 12% difference in disease-free survival at three years between the trastuzumab and the control group.

Chemotherapy is used in both premenopausal and postmenopausal women with high-risk breast tumors. The choice of drug used depends on the individual because there are different side effects associated with the different drugs. The Early Breast Cancer Trialists’ Collaborative Group meta-analyses have concluded that anthracycline containing chemotherapy regimens such as FAC (fluorouracil, doxorubicin, cyclophosphamide) or FEC (fluorouracil, epirubicin, cyclophosphamide) are more effective than CMF (cyclophosphamide, methotrexate, fluorouracil) chemotherapy. Anthracycline-based chemotherapy reduces the annual breast cancer death rate by approximately 38% for women younger than 50 years old and by 20% for those aged 50-69 years at diagnosis(8). The National Cancer Institute of Canada MA5 trial has also demonstrated that a CEF (cyclophosphamide, epirubicin, fluorouracil) regimen is associated with improved relapse-free and overall survival compared with CMF(24). A retrospective study of the MA5 trial data suggests that HER2 positive tumors may be more sensitive to anthracyclines than CMF(25). A subsequent meta-analysis suggests that HER2 status predicts benefit from anthracycline-based chemotherapy(26). Data from randomized trials of the Cancer and Leukemia Group B (CALGB) and US Breast Cancer Intergroup suggest that chemotherapy significantly improves the outcome of patients with ER negative tumors(27).
Benefit from the addition of paclitaxel to AC (doxorubicin, cyclophosphamide) chemotherapy was observed in patients with ER negative HER2 positive tumors, but not ER negative HER2 negative tumors, in an exploratory analysis of CALGB 9344 data(28).

The selection of adjuvant systemic therapy has evolved from being based on risk assessment by tumor size, nodal status and metastasis status, to targeted therapies for ER and HER2 positive tumors. However, there is still no validated marker that can predict sensitivity to chemotherapy in patients. Advances in technology, such as gene expression profiling, will hopefully help identify gene signatures that predict treatment response, breast cancer subgroups that are more likely to respond to targeted therapies or chemotherapy, and finally new molecular targets for drug development.

THESIS OBJECTIVES

The concept of intrinsic biological breast cancer subtypes has been of great clinical interest since its first introduction(13). However, gene expression profiling by DNA microarray has not proven to be a clinically practical assay method. Histopathology studies on formalin-fixed, paraffin-embedded tumor tissues remain the gold standard for breast cancer diagnosis at this moment. Immunohistochemistry is currently the best available method to assess the expression of proteins in relation to the specific histomorphology of breast cancer. Using tissue microarrays of 930 invasive breast cancer patients, Torsten O. Nielsen, in collaboration with Charles M. Perou of the University of North Carolina, Chapel Hill, has developed an immunohistochemical panel of four antibodies (estrogen receptor (ER), HER2, Epidermal Growth Factor Receptor, cytokeratin 5/6)(29) that allows for the identification of the three major intrinsic subtypes (Luminal, Basal-like and HER2+).
This panel also led to the finding that the basal-like subtype, characterized by poor prognosis, is more frequent in pre-menopausal African Americans than in post-menopausal African Americans and Caucasians of any age(30).

We believe that a limited additional set of immunohistochemical markers can be developed to discriminate the ER+/Luminal A and ER+/Luminal B subtypes. Although ER positive breast tumors generally have good survival probability and respond well to hormonal therapy, we hypothesize that the two ER+ subtypes identified by additional biomarkers will have significantly different prognoses. Finally, we also aim to validate the superior prognostic value of our previous surrogate immunopanel to identify aggressive basal-like tumors in comparison with a current clinical definition, the triple negative phenotype (negative for all of ER, PR and HER2).

In order to do so, we constructed clinically annotated tissue microarrays representing 4444 breast cancers to test our hypotheses. Tissue microarrays permit high throughput analysis of standard formalin-fixed, paraffin-embedded clinical material, enable direct correlation of biomarkers with histopathology (the diagnostic gold standard) and minimize consumption of valuable tissue resources. The study population was derived from patients with breast cancers diagnosed between 1986 and 1992 in British Columbia whose tumors were tested by the central ER laboratory at Vancouver General Hospital. Data linkage was done by the Breast Cancer Outcomes Unit, which maintains a population-based database of patients with breast cancer referred to the British Columbia Cancer Agency (BCCA). About 75% of all breast cancers diagnosed in BC were referred to BCCA within a year of diagnosis during the study period(31).
Abstracted clinical information includes age, sex, menopausal status, date of diagnosis, histology, grade, tumor size, number of involved axillary nodes, presence of lymphatic or vascular invasion adjacent to the primary tumor, type of local therapy, type of initial adjuvant systemic therapy, dates of first local, regional and distant recurrence, date of death and cause of death. This study was approved by the human research ethics institutional review board of the University of British Columbia and BC Cancer Agency.

This was a multidisciplinary project that required me to gain biostatistics skills and knowledge of breast cancer biomarkers and adjuvant systemic therapies by reading and interacting with experts in each area.

References:

1. Berry DA, Cronin KA, Plevritis SK, Fryback DG, Clarke L, Zelen M, et al. Effect of screening and adjuvant therapy on mortality from breast cancer. N Engl J Med 2005;353(17):1784-92.
2. Newman LA, Mason J, Cote D, Vin Y, Carolin K, Bouwman D, et al. African-American ethnicity, socioeconomic status, and breast cancer survival: a meta-analysis of 14 studies involving over 10,000 African-American and 40,000 White American patients with carcinoma of the breast. Cancer 2002;94(11):2844-54.
3. Turner N, Tutt A, Ashworth A. Hallmarks of 'BRCAness' in sporadic cancers. Nat Rev Cancer 2004;4(10):814-9.
4. Robbins and Cotran Pathologic Basis of Disease. 7th ed. Philadelphia, PA: Elsevier Saunders; 2005.
5. Shen Y, Yang Y, Inoue LY, Munsell MF, Miller AB, Berry DA. Role of detection method in predicting breast cancer survival: analysis of randomized screening trials. J Natl Cancer Inst 2005;97(16):1195-203.
6. Britton PD. Fine needle aspiration or core biopsy. The Breast 1999;8(1):1-4.
7. Galea MH, Blamey RW, Elston CE, Ellis IO. The Nottingham Prognostic Index in primary breast cancer. Breast Cancer Res Treat 1992;22(3):207-19.
8.
Effects of chemotherapy and hormonal therapy for early breast cancer on recurrence and 15-year survival: an overview of the randomised trials. Lancet 2005;365(9472):1687-717.
9. Slamon DJ, Leyland-Jones B, Shak S, Fuchs H, Paton V, Bajamonde A, et al. Use of chemotherapy plus a monoclonal antibody against HER2 for metastatic breast cancer that overexpresses HER2. N Engl J Med 2001;344(11):783-92.
10. Wolff AC, Hammond ME, Schwartz JN, Hagerty KL, Allred DC, Cote RJ, et al. American Society of Clinical Oncology/College of American Pathologists guideline recommendations for human epidermal growth factor receptor 2 testing in breast cancer. J Clin Oncol 2007;25(1):118-45.
11. Harris L, Fritsche H, Mennel R, Norton L, Ravdin P, Taube S, et al. American Society of Clinical Oncology 2007 update of recommendations for the use of tumor markers in breast cancer. J Clin Oncol 2007;25(33):5287-312.
12. Tamoxifen for early breast cancer: an overview of the randomised trials. Early Breast Cancer Trialists' Collaborative Group. Lancet 1998;351(9114):1451-67.
13. Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, et al. Molecular portraits of human breast tumours. Nature 2000;406(6797):747-52.
14. Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A 2001;98(19):10869-74.
15. Dowsett M, Harper-Wynne C, Boeddinghaus I, Salter J, Hills M, Dixon M, et al. HER-2 amplification impedes the antiproliferative effects of hormone therapy in estrogen receptor-positive primary breast cancer. Cancer Res 2001;61(23):8452-8.
16. Singletary SE. Breast cancer management: The road to today. Cancer 2008;113(S7):1844-1849.
17. Eifel P, Axelson JA, Costa J, Crowley J, Curran WJ, Jr., Deshler A, et al. National Institutes of Health Consensus Development Conference Statement: adjuvant therapy for breast cancer, November 1-3, 2000.
J Natl Cancer Inst 2001;93(13):979-89.
18. Favourable and unfavourable effects on long-term survival of radiotherapy for early breast cancer: an overview of the randomised trials. Early Breast Cancer Trialists' Collaborative Group. Lancet 2000;355(9217):1757-70.
19. Fisher B, Anderson S, Bryant J, Margolese RG, Deutsch M, Fisher ER, et al. Twenty-year follow-up of a randomized trial comparing total mastectomy, lumpectomy, and lumpectomy plus irradiation for the treatment of invasive breast cancer. N Engl J Med 2002;347(16):1233-41.
20. Clarke M, Collins R, Darby S, Davies C, Elphinstone P, Evans E, et al. Effects of radiotherapy and of differences in the extent of surgery for early breast cancer on local recurrence and 15-year survival: an overview of the randomised trials. Lancet 2005;366(9503):2087-106.
21. Howell A, Cuzick J, Baum M, Buzdar A, Dowsett M, Forbes JF, et al. Results of the ATAC (Arimidex, Tamoxifen, Alone or in Combination) trial after completion of 5 years' adjuvant treatment for breast cancer. Lancet 2005;365(9453):60-2.
22. Thurlimann B, Keshaviah A, Coates AS, Mouridsen H, Mauriac L, Forbes JF, et al. A comparison of letrozole and tamoxifen in postmenopausal women with early breast cancer. N Engl J Med 2005;353(26):2747-57.
23. Romond EH, Perez EA, Bryant J, Suman VJ, Geyer CE, Jr., Davidson NE, et al. Trastuzumab plus adjuvant chemotherapy for operable HER2-positive breast cancer. N Engl J Med 2005;353(16):1673-84.
24. Levine MN, Pritchard KI, Bramwell VH, Shepherd LE, Tu D, Paul N. Randomized trial comparing cyclophosphamide, epirubicin, and fluorouracil with cyclophosphamide, methotrexate, and fluorouracil in premenopausal women with node-positive breast cancer: update of National Cancer Institute of Canada Clinical Trials Group Trial MA5. J Clin Oncol 2005;23(22):5166-70.
25. Pritchard KI, Shepherd LE, O'Malley FP, Andrulis IL, Tu D, Bramwell VH, et al. HER2 and responsiveness of breast cancer to adjuvant chemotherapy.
N Engl J Med 2006;354(20):2103-11.
26. Gennari A, Sormani MP, Pronzato P, Puntoni M, Colozza M, Pfeffer U, et al. HER2 status and efficacy of adjuvant anthracyclines in early breast cancer: a pooled analysis of randomized trials. J Natl Cancer Inst 2008;100(1):14-20.
27. Berry DA, Cirrincione C, Henderson IC, Citron ML, Budman DR, Goldstein LJ, et al. Estrogen-receptor status and outcomes of modern chemotherapy for patients with node-positive breast cancer. JAMA 2006;295(14):1658-67.
28. Hayes DF, Thor AD, Dressler LG, Weaver D, Edgerton S, Cowan D, et al. HER2 and response to paclitaxel in node-positive breast cancer. N Engl J Med 2007;357(15):1496-506.
29. Nielsen TO, Hsu FD, Jensen K, Cheang M, Karaca G, Hu Z, et al. Immunohistochemical and clinical characterization of the basal-like subtype of invasive breast carcinoma. Clin Cancer Res 2004;10(16):5367-74.
30. Carey LA, Perou CM, Livasy CA, Dressler LG, Cowan D, Conway K, et al. Race, breast cancer subtypes, and survival in the Carolina Breast Cancer Study. JAMA 2006;295(21):2492-502.
31. Olivotto A, Coldman AJ, Hislop TG, Trevisan CH, Kula J, Goel V, et al. Compliance with practice guidelines for node-negative breast cancer. J Clin Oncol 1997;15(1):216-22.

CHAPTER TWO

GENE EXPRESSION PROFILING OF BREAST CANCER ¹

This work represents a comprehensive review of current published works on gene expression profiling of breast cancer as well as the importance of intrinsic breast cancer subtypes. A brief introduction to the common data analysis methods is also included. Reprinted with the permission of Annual Reviews.

¹ A version of this chapter has been published. (Cheang MC, van de Rijn M, Nielsen TO. Gene Expression Profiling of Breast Cancer. Annu Rev Pathol.
2008;3:67-97)

GENE EXPRESSION PROFILING OF BREAST CANCER

Since the introduction of DNA microarray(1, 2) technology in the mid-1990s, breast cancer has probably been the carcinoma most intensively studied by gene expression profiling. DNA microarrays allow researchers to measure the expression of tens of thousands of genes concurrently in one tissue sample. Enormous amounts of data are thereby generated, making microarrays an excellent tool for screening studies, for hypothesis generation, and for elucidating broad patterns of gene expression in health and disease. Over the past decade, there have been increasing numbers of reports describing gene signatures that help to explain the biology of breast cancer and/or have potential clinical value for prognosis, for predicting response to treatment, or for identifying therapeutic targets for drug development. Nevertheless, controversy remains about the validity and reproducibility of the findings, in large part because of the nature of these studies, in which thousands of genes are being assessed on a necessarily much smaller number of specimens(3). Owing to the risk, with this type of data set, of generating false inferences(4), an entire new field of bioinformatics has emerged, and the best data analysis tools to use are currently a matter of intense debate. Stringent study design and data interpretation and, most importantly, validation of results on independent cohorts of patient samples, are required for this technology to advance the clinical management of breast cancer(3). Such studies are now underway.

EXPRESSION PROFILING METHODOLOGY

Microarray Platforms Used in Breast Cancer Studies

In breast cancer research to date, the most commonly used platforms for gene expression profiling have been spotted cDNA(1) and oligonucleotide microarrays(5). Each platform yields measurements of mRNAs present in a tumor specimen, and requires high-quality total RNA that is best isolated from fresh-frozen tissue.
The frozen tumors must be handled carefully and consistently during RNA extraction(6) to optimize the integrity and reliability of data for subsequent exploratory data analysis and to allow biologically relevant statistical inferences to be drawn.

Although several platforms for microarray measurements are in use, a mainstay of publicly available sequence information has been the spotted cDNA microarray. Probes are derived from established cDNA libraries, typically encoding 500 to 2000 nucleotides from the 3' regions of transcripts. The sequences are amplified by polymerase chain reaction, then spotted on the surface of a chemically coated microscope glass slide in an array format. These immobilized cDNA sequence probes will bind to fluorescently labeled complementary sequences of corresponding gene targets present in a biological sample. Hybridization characteristics can be different between probes because of variations in spot size, probe length, and G-C content. As a result, the absolute measurements of staining intensities are not strictly comparable among probe spots and are therefore measured relative to a reference sample RNA pool. To label the genes expressed in tumor samples, the purified RNA sample is first reverse transcribed, incorporating a red-fluorescing nucleotide (Cy-5-dUTP), then cohybridized onto the microarray slide along with green fluorescent (Cy-3)-labeled reference RNA. The fluorescent signals are measured by emissions at the appropriate wavelengths, and the digitally scanned images are quantified. The two channel images are superimposed, to give a preliminary visual coloring readout representing the gene expression of each spot (Figure 1.1a). In this manner, each sample analyzed can generate in the range of 40,000 data points.
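
The relative measurement described above can be illustrated with a minimal sketch: each spot's background-corrected red (Cy-5, tumor) and green (Cy-3, reference) intensities are reduced to a log2 ratio. The function name, the `floor` parameter, and the example intensities are illustrative assumptions and do not come from any particular scanner or analysis package.

```python
import math

def log2_ratio(red, green, red_bg=0.0, green_bg=0.0, floor=1.0):
    """Return log2(tumor / reference) for one spot.

    red, green       : raw Cy-5 (tumor) and Cy-3 (reference) intensities
    red_bg, green_bg : local background estimates for each channel
    floor            : assumed lower bound that keeps the ratio defined
                       when background exceeds signal
    """
    r = max(red - red_bg, floor)
    g = max(green - green_bg, floor)
    return math.log2(r / g)

# Hypothetical spots: (red, green, red background, green background)
spots = {
    "gene_up": (4100.0, 1100.0, 100.0, 100.0),    # higher in tumor
    "gene_down": (600.0, 2100.0, 100.0, 100.0),   # lower in tumor
    "gene_same": (1500.0, 1500.0, 100.0, 100.0),  # unchanged
}

ratios = {name: log2_ratio(*values) for name, values in spots.items()}
```

A positive value indicates higher expression in the tumor than in the reference pool, a negative value the reverse; working on the log scale makes a two-fold increase and a two-fold decrease symmetric around zero.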
The disadvantages of cDNA microarrays, which include repetitive DNA sequences, poorly characterized genes, and variable hybridization kinetics, can be avoided by using fully defined oligonucleotides that represent unique sequences already mapped to the genome. Defined oligonucleotides, typically 30- to 70-mers, are spotted on glass slides using ink jet technology, and the readouts can be interpreted in a similar fashion as those from cDNA microarrays. Agilent-type arrays [e.g., MammaPrint(7)] and nonproprietary human exonic evidence-based oligonucleotide arrays being printed at several academic centers (http://www.microarray.org/sfgf/heebo.do) are examples currently in widespread use. High-density oligonucleotide microarrays, exemplified by Affymetrix GeneChip®(5) technology, use photolithography and combinatorial chemistry to enable synthesis of probes on a quartz wafer. Typical Affymetrix chips contain between 10^5 and 10^6 defined probe elements. In contrast to spotted oligonucleotide microarrays, the probes on these ultrahigh-density arrays are 25 nucleotides long, with genes represented by a set of 10-20 pairs of perfect match and mismatch probes. The perfect match probes provide quantitative measurements of fluorescent target sample binding, and the mismatch probes serve as internal controls. These high-density microarrays are more expensive than spotted microarrays and are not readily customizable. Affymetrix GeneChip® assays incorporate an amplification step and need less total RNA template from each tumor sample, 1-2 µg compared with 10-20 µg needed for spotted cDNA microarrays. Biostatisticians have also prepared open-source statistical procedures written especially for handling such data (e.g., BioConductor, http://www.bioconductor.org). Alternatively, all the above types of microarrays can be used to measure DNA copy number changes, a technique known as array-based comparative genomic hybridization(8, 9).
The identification of limited sets of key genes within expression signatures has provided a justification for collecting focused expression profiles (dozens to hundreds of genes) through quantitative real-time polymerase chain reaction (qRT-PCR)(10, 11). The most relevant genes of interest are deduced from microarray studies and other previous investigations. Using unique primer pairs specific for each gene, mRNA transcripts are amplified in a fashion that allows a very accurate quantification of expression (Figure 1.2). This methodology is applicable to very small specimens (e.g., fine needle aspirate or core biopsy material) and, with proper primer design and optimization, can be applied to formalin-fixed, paraffin-embedded tissues. The choice of genes to include in the assay [e.g., optimized for outcome prediction in a particular clinical setting(12) versus based primarily on biologic differences] can be difficult, but becomes critical for developing a clinically useful breast cancer assay by this approach.

Data Processing

The massive amount of data created by microarray technologies fosters close collaboration between experimental biologists and statisticians. Commercial software (e.g., the GeneSpring Analysis Platform from Agilent Technologies) and open-source code (e.g., BioConductor, http://www.bioconductor.org) are available for analysis of genome-wide expression data. The open-source projects provide a wide range of powerful statistical and graphical methods for DNA microarray data analysis and promote high-quality documentation of the methods to facilitate reproducibility. Users thereby can apply the most up-to-date methods, without the worry of additional costs of add-on modules required with commercial software. However, scientists who want to engage in computational biology need to develop some programming skills.
More user-friendly software freely available for academic users includes dChip (DNA-Chip Analyzer, http://biosun1.harvard.edu/complab/dchip/) and BRB ArrayTools (http://linus.nci.nih.gov/BRB-ArrayTools.html).

Both spotted cDNA and high-density oligonucleotide array data are subject to variations during array manufacturing, RNA isolation, reverse transcription and labeling, sample hybridization, and image analysis; each of these can impact the gene measurements and create artifactual expression signatures. Normalization is a vital first step to minimize such variation, but different normalization methods can lead to different interpretations of the data. Spotted cDNA and oligonucleotide microarrays use one probe per gene and generate two channel measurements that assume the intensity of each probe is proportional to the amount of target; it is therefore essential to correct measurements by subtracting the background intensity in the spot region. A normalization method is then applied within each array to eliminate systematic variations in dye biases(13).

Because data analysis invariably compares results from multiple array experiments, a scaled normalization is carried out between arrays to adjust the distribution of logarithm intensity ratios to have a median of zero for each array. Most normalization methods are done by local regression(14, 15) although more sophisticated statistical methods(13, 16, 17) have also been proposed. The Affymetrix GeneChip employs multiple probes for each gene and a single-color detection method. Background intensity correction is also the first step(18, 19). Affymetrix GeneChip data can then be normalized across arrays, with various approaches available such as applying simple linear scaling in Affymetrix Microarray Suite software, nonlinear smooth curves using “rank invariant set” based on housekeeper control genes(20), and probe-level quantile normalization(19).
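
Two of the between-array adjustments mentioned above can be sketched briefly: centering each array's log ratios on a median of zero, and quantile normalization, which forces every array to share the same intensity distribution. This is a simplified illustration under stated assumptions: the function names are hypothetical, tied values are ranked in their order of appearance rather than averaged, and no claim is made that this matches the exact procedure of any cited package.

```python
from statistics import median

def median_center(log_ratios):
    """Shift one array's log ratios so that their median is zero."""
    m = median(log_ratios)
    return [x - m for x in log_ratios]

def quantile_normalize(arrays):
    """Give every array the same empirical distribution.

    arrays: list of equal-length lists, one list of intensities per array.
    The reference distribution is the mean of the k-th smallest value
    across arrays; each value is replaced by the reference value at its rank.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)]

    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # probe indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized

# Hypothetical data: one array of log ratios, and two arrays of three probes
centered = median_center([1.0, -0.5, 0.25])
normalized = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

After quantile normalization the arrays contain identical sets of values and differ only in which probe holds which value, which is why the method removes array-wide intensity shifts but preserves the rank order of probes within each array.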
As Affymetrix employs a set of several probes to represent one gene, summation of data is used to determine the corresponding mRNA quantities(19, 21). Compared with spotted cDNA and oligonucleotide microarrays, high-density oligonucleotide microarray data processing is less intuitive to many experimental biologists, and choosing the most appropriate normalization methods can be challenging.

Presentation of Primary Data

Clearly, issues such as normalization and the correct interpretation of these very large data sets are a subject of ongoing discussion. This is one major reason that the consensus among the microarray research community is that raw data supporting published studies should be made publicly available, allowing researchers to use their own techniques to mine data sets and compare studies. In 2001, the Microarray Gene Expression Data Society proposed experimental annotation standards, known as minimum information about a microarray experiment, and these standards are now supported by leading journals. Studies undertaken to compare data from different gene expression microarray platforms have found poor correlation between mRNA measurements from matched genes(22), and differences in data analysis approaches also create apparent poor correlations across platforms(23). However, recent large-scale studies have supported technical consistency across platforms and laboratories(24).

In breast cancer, concerns have been raised about the disappointingly small number of overlapping genes in different published gene expression predictors, leading to questions about the applicability of using microarrays as a platform for developing consistent assays of cancer biology. The most powerful method to confirm the relevance of a particular gene set is to determine how expression levels for that gene set predict outcome in samples analyzed at a different institution using a different platform.
In response, an important study tested five published gene expression predictors of breast cancer prognosis(25), namely the 70-gene profile(26, 27), activated wound response(28, 29), recurrence score(12), intrinsic biological subtypes(30-33), and two-gene ratio(34), against a common set of 295 primary breast cancers(25). Results suggest that four of the five tested models show significant agreement in outcome predictions for individual patients.

Although some cancer microarray expression profile signatures fail to be confirmed, possibly because initial gene lists were developed using inadequate, purely internal training-validation designs(35), there are many that can be validated by comparisons with other gene array study sets or through complementary techniques such as tissue microarrays (TMAs). Below, these individual gene expression predictors are discussed in more detail. However, one very important conclusion that can be drawn is that public access to the primary data (e.g., Figure 1.1b) not only makes that data set accessible for reanalysis with new bioinformatic tools, but perhaps even more importantly provides a data set useful for external validation studies of other signatures, which is the most critical step in discerning their potential clinical utility. Selected, large, publicly accessible repositories of breast cancer expression profiling data are shown in Table 1.1.

Statistical Methods

After proper data processing and normalization, microarray expression profiling data are explored to seek biologically and clinically meaningful patterns. Bioinformaticians have developed many statistical methods to deduce gene expression predictors, which must be carefully applied to avoid problems. In primary analyses without an external validation set, there is a risk that genes found to be of interest in the initial set may not be significant in other samples, a phenomenon referred to as overfitting(36).
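
The overfitting risk described above can be made concrete with a small simulation. The sketch below is purely illustrative and uses simulated noise rather than any real data set: no gene carries any true signal, yet the gene selected as "most different" between two arbitrary groups in a training cohort shows a large apparent effect that shrinks when the same gene is measured in an independent cohort. The function name, cohort size, gene count, and seed are all assumptions chosen for the example.

```python
import random

def top_gene_effect(n_samples=20, n_genes=2000, seed=0):
    """Select the most 'differential' gene in pure noise, then re-test it.

    Returns (training_effect, validation_effect): the absolute difference in
    group means for the selected gene in the cohort used for selection and
    in an independent cohort.
    """
    rng = random.Random(seed)
    half = n_samples // 2

    def noise_cohort():
        # Rows are genes, columns are samples; every value is random noise.
        return [[rng.gauss(0.0, 1.0) for _ in range(n_samples)]
                for _ in range(n_genes)]

    def mean_diff(values):
        # Group labels are arbitrary: first half versus second half.
        return sum(values[:half]) / half - sum(values[half:]) / (n_samples - half)

    training = noise_cohort()
    validation = noise_cohort()

    # "Discovery": the gene with the most extreme difference in training data.
    best = max(range(n_genes), key=lambda g: abs(mean_diff(training[g])))
    return abs(mean_diff(training[best])), abs(mean_diff(validation[best]))

results = [top_gene_effect(seed=s) for s in range(3)]
```

Because thousands of genes are screened on only twenty samples, some gene will look strongly associated with the grouping by chance alone, which is why validation on independent samples is emphasized throughout this chapter.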
Older papers in the field often merely extracted the genes with the most extreme differences in expression or associations with clinical parameters, but global analyses of patterns within the data take much better advantage of the genome-wide scale of microarray data sets. Cluster analysis(37-40) is one of the most widely used multivariate methods employed in breast cancer gene expression studies to organize genomic-scale data. This method groups genes on the basis of their overall similarity in expression patterns across specimens, and tumor samples on the basis of their similarities in gene expression. Two popular clustering algorithms are hierarchical clustering and k-means clustering.

Hierarchical clustering(41-43) starts by assigning each subject to its own cluster, so that each cluster contains one subject. Then the closest and most similar pair of clusters is merged into one new cluster. The similarities between the new cluster and each of the old clusters are computed again, and this is repeated until all subjects are clustered into a single overarching cluster of all genes or all tumors. A dendrogram tree serves as a graphical summary of the data, with the lengths of branches representing the degree of relationship between genes or subjects (Figure 1.2f). One weakness of hierarchical clustering is that small changes in the data can lead to quite different dendrogram relationships, and this kind of summary is valid only when the pairwise dissimilarities between objects follow the hierarchical structure resulting from the algorithm(44, 45).

k-means clustering(46) attempts to segment N subjects into a predefined number of partitions. In brief, the operator first chooses the number of clusters, K. Then the algorithm randomly generates K clusters and computes the cluster centers in multidimensional space, known as centroids.
Each subject is assigned to the nearest centroid, new centroids are computed, and these steps are iterated until convergence criteria are met (e.g., assignment of subjects does not change). The relative simplicity of this k-means algorithm is favored for its speed in handling large data sets. The disadvantage is that the initial selected number of clusters is somewhat arbitrary, and, if changed, cluster memberships change and may not be nested within the previously assigned clusters.

In 1998, Eisen et al.(47) implemented these procedures in publicly available software known as Cluster, which allows experimental biologists with minimal knowledge of statistical programming to carry out these analyses. A Java-based program, Treeview(48), was similarly made available to display the clustered profiles in a visually appealing manner, often referred to as a heatmap. This graphical display is a useful tool for a two-dimensional representation of the patterns of differential gene expression. The identification of breast cancer intrinsic biological subtypes(31, 33), discussed below, used these methods.

Unsupervised forms of data analysis such as hierarchical clustering permit discovery of inherent data patterns and relationships among genes and tumors, for example, highlighting the previously unrecognized basal-like type of breast cancer(31). By comparison, supervised data analysis techniques identify genes whose expression most closely tracks with a known variable of interest such as tumor grade or patient outcome. An example of a result from supervised analysis is the 70-gene breast cancer prognostic signature(26), which linked tumor gene expression (among node-negative patients under age 55 with tumors less than 5 cm in size) to the presence or absence of distant metastases at 5 years.

The advantage of supervised approaches is that the extracted gene list will almost certainly link to the clinical question being addressed.
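
The two unsupervised algorithms described earlier in this section, hierarchical clustering and k-means, can be sketched in a few lines each. The code below is a minimal illustration, not the implementation used by Cluster or any other cited tool. It assumes Euclidean distance and average linkage for the hierarchical version, and it seeds k-means with the first K profiles instead of a random start so that the example is reproducible; all function names and the toy profiles are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hierarchical_merges(profiles):
    """Agglomerative clustering (average linkage); returns the merge history."""
    clusters = [[i] for i in range(len(profiles))]  # one subject per cluster
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters
                d = sum(euclidean(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

def k_means(profiles, k, max_iter=100):
    """Iterate assignment and centroid updates until memberships stop changing."""
    centroids = [list(p) for p in profiles[:k]]  # assumed deterministic start
    assignment = None
    for _ in range(max_iter):
        new_assignment = [min(range(k), key=lambda c: euclidean(p, centroids[c]))
                          for p in profiles]
        if new_assignment == assignment:  # converged
            break
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(profiles, assignment) if a == c]
            if members:  # an emptied cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

# Four hypothetical tumors measured on two genes
profiles = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
merges = hierarchical_merges(profiles)
assignment, centroids = k_means(profiles, k=2)
```

The merge history is the information a dendrogram displays, with the recorded distance at each merge setting the branch height. Rerunning `k_means` with a different `k` can reassign subjects in ways that are not nested within the earlier partition, which is the limitation noted above.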
The biggest disadvantage is that because gene selection is typically optimized against outcome from a set of patients with certain tumor characteristics and treatment regimens, the ability to generalize outside the patient set from which it is derived may be limited.

False discovery risks need to be quantified and limited to optimize gene lists(49), for example, through the sophisticated semisupervised clustering and supervised principal component methods, proposed by Bair & Tibshirani(50), using the nearest shrunken centroids. Such methods can serve, for example, as a method for developing a clinically practical customized mini-array or qRT-PCR assay sufficient to classify breast tumors into biological subtypes and to predict outcome. Other supervised microarray data analysis options include self-organizing maps(51), support vector machines(52), artificial neural networks(53), Significance Analysis of Microarrays(54), and Gene Set Enrichment Analysis(55). Regardless of the method chosen, for both unsupervised and, particularly, supervised approaches, results from genomic-scale experiments still need to be validated on independent series of tumors.

BREAST CANCER EXPRESSION PROFILES

Studies to Understand Breast Tumor Biology

The ability to measure concurrently the expression levels of tens of thousands of genes in cell lines and tumor specimens has, not surprisingly, expanded our understanding of the biology underlying breast cancer. The first published studies, from the Brown and Botstein labs at Stanford, showed that cDNA microarrays could identify expression signatures specific to breast cancer cells(56, 57). A similar study showed that gene expression patterns can predict the invasive capacity of breast cancer cell lines(58).

The first major published study on gene expression profiling of large numbers of clinically annotated breast tumor tissue specimens came out in 2000, wherein Perou et al.
(31) demonstrated a biological classification of breast cancers on the basis of distinctive patterns of gene expression. This study applied unsupervised hierarchical clustering to group 65 breast surgical excision specimens from 42 unique patients, using a condensed list of intrinsic genes. The intrinsic genes were defined as those showing significant variation in expression across different tumors but not between paired samples from the same tumor; this is one method to limit analysis to the subset of genes whose variation is more likely a product of biological differences between tumor specimens from different patients than of changes induced by individual patient or technical variability. These molecular interpretations, subsequently validated by studies from Sorlie et al.(32, 33), correlated with patient outcome and, in addition to recognizing known clinically important subgroups, also indicated the existence of novel subtypes of breast carcinoma.

The five major biological breast cancer subtypes so derived are termed Luminal A, Luminal B, HER2 overexpressing, basal-like, and normal-like; their variations in growth rate, specific signaling pathways, and cellular composition of tumors can be explained, at least in part, by their corresponding gene expression patterns. In particular, the proliferation genes are elevated in the three clinically aggressive subtypes: Luminal B, basal-like, and HER2 overexpressing breast cancers. In subsequent studies, these signatures have shown significant reproducibility in predicting patient survival in different cohorts of patients, by different laboratories using different gene array platforms and by novel unsupervised statistical methods(59-62).

One particularly important finding of this research was the highlighting and characterization of the basal-like subgroup of breast cancers, a biologically distinctive group for which targeted therapies are currently unavailable(63).
The existence of the novel subtype of basal breast carcinoma was confirmed not only by independent gene array studies but also by a series of immunohistochemical studies on TMAs containing breast carcinoma samples with known clinical features(64-66). The intrinsic subtypes identified by gene expression profiling also correlate with genomic copy number aberrations identified by array-based comparative genomic hybridization(67).

Another interesting breast cancer gene expression pattern reflecting the underlying biology of cancer is the wound-response signature identified by Chang et al. (28, 29), which supports the concept [proposed by H. F. Dvorak in 1986 (68)] that a molecular program activated in normal wound healing is co-opted during cancer invasion and progression. This transcriptional signature was derived from cultured, serum-stimulated fibroblasts(29), and expression of this signature predicted poor clinical outcome among 295 early breast cancer tumors, improving on the risk stratification provided by clinical risk factors(28). The same signature also predicted poor outcome in gastric and lung carcinoma, indicating that the wound response is shared by a variety of tumor types and highlighting how investigations of breast cancer expression profiles have implications for other kinds of cancer. A similar approach was used to identify a hypoxia response signature from cultured mammary and renal tubular epithelial cells, and when present in breast cancer specimens, this gene signature is significantly associated with poor clinical outcome, independent of the wound healing signature and of standard clinical parameters(69).
A recent study has proposed an “invasiveness” gene signature based on 186 genes, which includes apoptosis genes BCL2 and CASP8, chemotaxis genes PLP2 and CXCL2, and proliferation genes SSR1, EMP1, and ERBB4, selected because they are differentially expressed in tumor-derived CD44+/CD24- highly tumorigenic breast cancer cells in comparison with reduction mammoplasty-derived normal breast epithelium cells(70). This signature is designed to track a minority population of breast cancer cells that behave like stem cells in mouse xenograft assays(71-73). The “invasiveness” gene signature shows strong association with clinical outcomes in breast cancers, consistent with the underlying idea that stem cells are present and important in breast cancer development(74, 75).

Intriguingly, similar to the wound-response signature mentioned above, this signature also has prognostic value when applied to other malignancies, including lung cancer, prostate cancer, and medulloblastoma data sets, suggesting it identifies a feature of general importance in cancer(70). The “invasiveness” gene signature probably tracks a different process (more likely to reflect cancer stem cells) than the wound healing response (a nonoverlapping signature more directly linkable to local invasion). Combining these two signatures results in even more impressive cancer risk stratification(70), which can be interpreted as supporting the “seed and soil” hypothesis of cancer(76).

Hereditary cases account for 5%-10% of breast cancers. BRCA1 and BRCA2 are the two susceptibility genes whose germ-line mutations are involved in most familial breast cancers, and their mutation status is reflected in somatic tumor gene expression profiles(32, 77).
A related gene array study compared BRCA1-mutated breast cancers with matched sporadic tumor specimens(78), and reported that a subset of genes involving cell adhesion and migration such as laminins, different collagens, and fibronectins characterizes the BRCA1-associated tumors. Familial non-BRCA1/BRCA2-mutated breast cancers segregate into two distinct subclasses on the basis of gene expression profiling, and these two subclasses do not cluster together with BRCA1- or BRCA2-mutated familial cases(79). These findings encourage the search for novel breast cancer predisposition genes among non-BRCA1/BRCA2-mutated familial breast cancer and should facilitate such studies, by allowing identification of similar cases within a heterogeneous background.

Studies to Improve the Diagnosis of Breast Cancer

Whereas gene expression profiling undoubtedly provides valuable biologic information to advance our understanding of breast cancer development and progression, its value as a clinical tool is less well proven. The diagnostic setting provides the most direct opportunity to translate gene expression information into clinical use. One relatively straightforward approach is to inspect the expression profile for biomarkers recognized by existing diagnostic immunohistochemistry antibodies. In this manner, for example, a simple immunohistochemistry panel was developed to identify the basal-like subtype of breast cancer(65). Breast cancer gene expression profiles are maintained at sites of distant metastasis(80), implying that such tests may even have some value in the setting of metastatic carcinoma of unknown primary sites.

Disrupted p53 function plays an important role in tumor progression and resistance to therapy(81-83). However, conventional detection methods such as immunohistochemistry cannot reliably differentiate wild-type versus mutant versus deleted p53, leading to inconsistent and sometimes misleading research results.
A 32-gene p53 signature, built from Affymetrix U133 expression profiles of 251 TP53-sequenced breast tumors, distinguishes p53-mutant from p53-wild-type tumors in multiple independent data sets, and does a better job predicting clinical outcome than mutation status(84). The relevant p53 signature genes are not canonical targets of p53, but rather are associated with cell proliferation and growth, transcription, ion transport, and breast cancer biology. The signature thus reflects net changes in several cancer biology pathways induced by p53 dysregulation, a type of analysis for which expression profiling is better suited than other methods.

Gene expression profiling studies have also been linked to breast cancer histology. Invasive lobular carcinomas generally proliferate more slowly than ductal tumors, lose E-cadherin expression(85), and are more often estrogen receptor (ER) and progesterone receptor (PR) positive(86). They can show histological overlap with ductal carcinomas and are not always reliably distinguished in practice; perhaps because of this, invasive lobular and ductal carcinomas are treated similarly on the basis of stage, ER, and HER2 (human epidermal growth factor receptor 2 oncoprotein, product of the ERBB2 gene) status.

Using spotted cDNA microarrays, a supervised analysis of 17 lobular versus 106 ductal breast cancers reported a signature of 11 genes (E-cadherin, survivin, cathepsin B, TP11, SPRY1, SCYA14, TFAP2B, thrombospondin 4, osteopontin, HLA-G, and CHC1) to distinguish lobular and ductal subtypes(87). This signature awaits external validation and clinical translation; a similar study found less clear-cut differences between lobular and ductal carcinomas(88).

Inflammatory breast cancer, in contrast, is easy to identify histologically and has underlying expression signatures that reflect the general biological subtypes seen in noninflammatory cases(89, 90).
Using an 18,000 feature cDNA array from the Netherlands Cancer Institute, Hannemann et al.(91) recently presented publicly available data from a comparison of forty cases of ductal carcinoma in situ with an equal number of invasive breast cancers. The authors present a classifier of 35 genes that could stably distinguish between groups with 91% accuracy, among which transforming growth factor-β2 and matrix metalloproteinase 11 were prominent members. The authors further described a classifier that distinguishes well-differentiated from poorly differentiated ductal carcinoma in situ. These findings were validated by internal statistical methods, but again await external validation and clinical translation.

Returning to the topic of hereditary breast cancer, the current diagnostic test for BRCA1/BRCA2 mutation is time and labor intensive. Given that subtle changes at the DNA level may induce more profound and widespread changes at the RNA level, gene expression profiling offers a potential alternative to identifying mutation carriers. Indeed, not only do the hereditary breast tumors themselves bear characteristic expression signatures(77-79), non-neoplastic fibroblasts from BRCA1 and BRCA2 mutation carriers also present distinct gene expression signatures after radiation-induced damage, consistent with an altered DNA repair response(92). If validated, this information could be used to create a screening test based on the net functional defect in these patients, which cannot always be inferred even from sequencing.

Studies to Identify Prognostic Signatures

Histological grade, based on mitotic index, nuclear pleomorphism, and architectural differentiation, is one of the most important prognostic factors for breast cancers, yet has only a moderate level of interobserver agreement among pathologists (93).
Thirty to sixty percent of breast cancers are graded as moderately differentiated (Grade 2), and this group of tumors shows intermediate outcome, providing relatively little guidance for adjuvant treatment decisions.

Using Affymetrix U133A microarrays and a rigorous training/test/external validation strategy, Sotiriou et al.(94) identified distinct gene expression patterns from 97 unique genes differentiating Grade 1 from Grade 3 tumors. These genes, including UBE2C, KPNA2, TPX2, FOXM1, STK6, CCNA2, BIRC5 (survivin), and MYBL2, function in cell proliferation and progression, and are in general more highly expressed in Grade 3 tumors. A scoring system, termed gene expression grade index, was developed from these data and serves to classify Grade 2 breast carcinomas into two risk groups, which are significantly associated with distinct relapse-free survival, independent of standard clinical parameters and ER status. In multivariate analysis, gene expression grade index had a higher hazard ratio than other prognostic factors (including nodal status) and displaced histologic grade from the Cox model. Ivshina et al.(95) reported similar findings shortly thereafter from a concurrent, independent study.

Breast tumors identified as Luminal A in other microarray studies have a low gene expression grade index compared with other luminal tumors(96), supporting the idea that expression levels of proliferation genes are useful in defining the poor-outcome Luminal B subtype. This finding is consistent with Dai et al.’s(97) results, which demonstrated that proliferation gene expression signatures are most powerfully prognostic among older, ER-positive patients. Although pathologists can assess proliferation activity by mitotic counts and/or Ki67 immunohistochemistry, fixation issues and variability in visual counting methods probably explain why a gene expression panel seems to do a significantly better job at this task.
For the moment, histological grading is so cheap and convenient that it will be hard to replace even by better measurements. However, if gene expression profiling in some form becomes more clinically accessible, proliferation signatures appear poised to be an important application in breast cancer analysis.

The 70-gene prognostic signature, identified from the Netherlands Cancer Institute primary breast cancer cases(26, 27), is an excellent example of applying supervised learning on gene expression profiles. This signature, showing prognostic value for distant metastasis within five years, was first identified using a cohort of 78 node-negative breast cancers occurring in women below age 55 who had not received systemic adjuvant therapy, using oligonucleotide microarrays(26). Many genes involving the hallmarks of cancer are represented: cell cycle, metastasis, angiogenesis, and invasion. This prognostic profile was then validated on a cohort of 295 young patients from the same institution, including node-positive and node-negative tumors with and without systemic treatment(27).

The prognostic value of the 70-gene signature was significantly better at predicting distant relapse-free survival than standard St. Gallen or National Institutes of Health clinical criteria. As this study included the original data set used to derive the prognostic value in its initial validation, a concern of possible data overfitting has generated debate in the microarray community(36), and it remains unclear if this signature really does perform better than the Nottingham prognostic index(98). Nevertheless, the authors of this study made their primary data publicly accessible, making this data set a particularly valuable resource for validation and comparative studies by others in the field(25, 28, 35, 69, 99, 100).
A Danish group confirmed the prognostic value of the 70-gene signature in low-risk primary breast cancers, but presented a 32-gene classifier that appeared to be better among the very lowest-risk subset(101), a result that requires external validation.

The 76-gene Rotterdam signature, identified by Wang et al.(102) using Affymetrix U133A arrays on a set of 115 breast cancer patients, was developed to predict distant relapse rate in untreated node-negative patients. This classification algorithm was trained and optimized in ER-negative and ER-positive patients separately, on the basis of the assumption that the mechanisms driving these two types of cancers are distinct. The genes forming the signature are involved mainly in DNA replication and repair, cell cycle and apoptosis, and immune response. The prognostic value of this 76-gene signature has been validated on an independent set of 180 node-negative untreated patients from different institutions(103) and is also available for public access.

Analysis of the particular genes expressed in a primary breast tumor may also be able to classify the risks for site of distant metastasis. The 70-gene and intrinsic subtype signatures have been shown to be preserved between primary and metastatic sites, supporting the idea that metastatic capacity is inherent in the primary tumor and that metastases closely reflect the biology of the primary tumor(80). Gene signatures that predict breast cancer metastasis to bone(104) and lung(105) have been published. Predicting risk of bone metastasis, in particular, could guide imaging follow-up (e.g., need for bone scans) and/or use of adjuvant bisphosphonates(104).

Prognostic models suffer from inherent limitations. In general, different patient populations and statistical methods will yield different optimal prognostic signatures(106, 107).
Supervised analyses optimized against outcome will necessarily be overfitted to the data set from which they are derived and will be expected to lose some prognostic power in validation studies on other data sets. Gene lists need not be expected to overlap, as distinct genes may track similar biological processes(25). Finally, no matter how sophisticated and thorough a microarray analysis may be, there is a stochastic component to patient outcome that will prevent any prognostic model from being perfect: there is a certain unavoidable randomness to fate.

Studies to Predict Tumor Response to Adjuvant Systemic Therapy

The major clinical goal in applying gene expression profiling to breast cancers is to guide more individualized treatment; conventional biomarkers other than ER and HER2 have not been able to help greatly in this regard. Thus, many studies have sought to use microarray expression profiling to go beyond prognostic signatures to develop predictors of drug response.

Expression profiling of cell cultures before and after drug treatment can be used to develop drug response predictors that may be applied to patient tumor specimens. Doxorubicin- and 5-fluorouracil-induced changes in gene expression seen in cell lines are also seen in tumor specimens from treated patients(108). Cells with luminal-type expression profiles responded by inducing cell cycle checkpoint genes not induced by cells with a basal-like expression profile, indicating that the intrinsic biologic subtype could influence tumor response to conventional chemotherapy. Using various chemotherapy-sensitive carcinoma cell lines as models, a 79-gene doxorubicin resistance signature (including the P-glycoprotein efflux pump) was identified using 43,000 spot cDNA microarrays(109).
This signature could be identified in publicly available primary breast cancer specimens and correlated with shorter patient survival, although it has yet to be validated in clinical material randomized to anthracycline versus nonanthracycline therapy.

Treatment of cell cultures with estrogens has also been used to identify an estrogen response signature, including PR, RERG, CTSD, and PDZ1 and proliferation-associated genes such as CCNB2, CCND1, MKI67, MYBL2, BIRC5 (survivin), and STK6(110). When applied retrospectively to multiple independent breast cancer expression profile data sets, this 822-gene classifier predicts outcome among ER-positive and tamoxifen-treated patients.

In the absence of long-term follow-up data, complete pathological response is used as a surrogate end point for patient benefit in a neoadjuvant setting, and exploratory microarray analyses have been incorporated into some clinical trial designs. A study using Affymetrix HgU95-Av2 arrays to compare expression profiles before and after neoadjuvant docetaxel reported that sensitive tumors have higher expression of genes involved in cell cycle, cytoskeleton, adhesion and protein transport, and modification(111), whereas resistant tumors express the mammalian target of rapamycin (mTOR) survival pathway. This small study (N = 13) requires confirmation, and the implication that mTOR inhibitors may make an effective combination therapy needs experimental validation.

In contrast, Hanneman et al.(112) were unable to generate expression predictors of pathologic response in 24 doxorubicin-cyclophosphamide and 24 doxorubicin-docetaxel neoadjuvant breast cancer patients (T > 3 cm and/or node positive) using 18K spot cDNA microarrays. A 31K cDNA microarray study of 42 patients (24 for classifier discovery and 18 for validation) yielded a 74-gene classifier to predict complete pathologic response to paclitaxel-fluorouracil-doxorubicin-cyclophosphamide (T/FAC) neoadjuvant chemotherapy(113).
In this case, the training-validation split was a very appropriate design method, but left small numbers of events in each group, yielding validation results with borderline significance and low sensitivity (i.e., the study correctly predicted three of seven patients who did respond and eleven of eleven who did not).

Expression profiles from 37 locally advanced patients treated with neoadjuvant liposomal doxorubicin-paclitaxel chemotherapy were evaluated using U133 Plus 2.0 microarrays(114) and linked to outcome. No strong link could be established with clinical or pathologic response at the primary site, although a gene list predicting nodal involvement at subsequent surgery was generated that had general prognostic value in univariate analyses in two other series.

For future studies, valuable, prospectively collected before-and-after neoadjuvant samples may best be used to validate preexisting hypotheses about gene predictors, rather than be sacrificed to deduce lists of novel predictor genes (an analysis that can and should still be performed secondarily from the data generated on such specimens). For example, in a study of 82 patients treated with neoadjuvant T/FAC profiled using U133A microarrays, pathologic complete response (pCR) was much higher in cases expressing the previously established basal-like and HER2-positive intrinsic subtype signatures (both with 45% pCR) than in tumors with luminal (2 of 30 pCR) or normal-like (0 of 10 pCR) expression profiles(115). Exploratory gene lists correlating with pCR were then generated by supervised analyses and, interestingly, are completely different between the basal-like and HER2 types, suggesting that tumors with different intrinsic biology employ different mechanisms of drug resistance.

One unanswered question is whether molecular profiles outperform available clinical and pathological parameters.
A 28K cDNA microarray study showed gene expression profiles perform as well as (but no better than) classifiers based on clinical parameters and the Nottingham prognostic index to predict odds of recurrences in a cohort of 85 premenopausal, lymph-node-positive breast cancer patients treated with adjuvant CMF (cyclophosphamide, methotrexate, and 5-fluorouracil)(116). This group of patients generally receives aggressive treatment, and the potential to identify any who could be spared chemotherapy would be clinically relevant.

ER-positive tumors are typically associated with better clinical outcomes and good response to hormonal therapies such as tamoxifen(117). However, a subset of patients experiences recurrence, and up to 40% develop resistance to tamoxifen(118). Results from randomized trials suggest that aromatase inhibitors can reduce the recurrence risk, especially at distant sites, for postmenopausal endocrine-responsive breast cancers(119-122). Response to such endocrine therapies can be correlated with global gene expression patterns among ER-positive breast cancers.

Using 18K cDNA microarrays, Jansen et al.(123) profiled 10- to 20-year-old frozen primary tumor specimens from patients who received adjuvant tamoxifen. They identified an 81-gene signature (26% involving estrogen action, 14% apoptosis, 9% extracellular matrix formation, and 6% immune response), discriminating 21 patients with objective response from 25 with progressive disease. This list was further reduced to a 44-gene optimal predictive signature and validated on a separate set of 66 breast tumors, confirming significant predictive value for longer time to progression. The overall accuracy to classify tamoxifen resistance was 80%.

A similar study by Ma et al.(34) used 22K oligonucleotide microarrays to profile 60 frozen primary tumor samples from women treated with adjuvant tamoxifen, from which they distilled a two-gene ratio predictor for tamoxifen resistance.
The genes, HOXB13 and IL17BR, did not have any previously characterized functional relevance to tamoxifen resistance. Initial validation was performed on 20 independent cases by qRT-PCR, which can be readily applied to formalin-fixed tissue. Subsequent validation studies found this ratio to be predictive only among node-negative breast tumors(124), and independent groups have not been able to verify strong concordance with other expression-profile-derived predictors (intrinsic subtype, 70-gene Amsterdam, wound healing, and recurrence score signatures)(25).

Standard pathology blocks of formalin-fixed, paraffin-embedded tissue are more amenable to qRT-PCR assay (Figure 1.2) than to full microarray profiling. Use of such material facilitates clinical translation and allows retrospective analysis of previously collected large cohorts with available follow-up data. Building upon published microarray data sets including the intrinsic subtypes and 70-gene Amsterdam signatures, qRT-PCR tests for 250 genes (together effectively constituting a partial expression profile) were developed, and, following testing on three independent clinical series, an optimized 21-gene assay (16 discriminators and 5 internal controls) was developed to predict recurrence in ER-positive, node-negative patients treated with adjuvant tamoxifen. The resulting recurrence score gives differential weightings to the contributing genes (proliferation: MKI67, STK15, BIRC5/survivin, CCNB1, MYBL2; estrogen response: ER, PGR, SCUBE2; HER2 amplicon: ERBB2, GRB7; local invasion: MMP11, CTSL2; antiapoptotic: BCL2, BAG1; drug metabolism/antioxidant: GSTM1; macrophage response: CD68) on the basis of model building in the training sets, and was shown to give a quantitative assessment of the likelihood of distant recurrence in 668 cases of ER-positive and node-negative breast cancers treated by adjuvant tamoxifen in the National Surgical Adjuvant Breast and Bowel Project (NSABP) B14 protocol(12).
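The general form of such a weighted multigene score can be sketched as follows. This is a minimal illustration only. It assumes that each gene's qRT-PCR measurement is a cycle threshold (Ct) value, that measurements are normalized against the mean of the internal control genes, and that genes are averaged within the functional groups named above before being combined. The weights, the control gene names (REF1, REF2) and the function names are hypothetical placeholders and are not the published coefficients of the 21-gene assay.

```python
from statistics import mean

# Functional groups as listed in the text. The weights are HYPOTHETICAL
# placeholders chosen for illustration, not the published assay coefficients.
GROUPS = {
    "proliferation": ["MKI67", "STK15", "BIRC5", "CCNB1", "MYBL2"],
    "estrogen": ["ER", "PGR", "SCUBE2"],
    "her2": ["ERBB2", "GRB7"],
    "invasion": ["MMP11", "CTSL2"],
    "antiapoptotic": ["BCL2", "BAG1"],
    "drug_metabolism": ["GSTM1"],
    "macrophage": ["CD68"],
}
WEIGHTS = {
    "proliferation": 1.0,
    "estrogen": -0.5,
    "her2": 0.5,
    "invasion": 0.1,
    "antiapoptotic": -0.1,
    "drug_metabolism": -0.1,
    "macrophage": 0.05,
}


def normalize(ct, reference_genes):
    """Return reference-normalized expression for every non-reference gene.

    A lower Ct means more transcript, so expression is defined here as
    mean(reference Ct) minus the gene's own Ct.
    """
    ref = mean(ct[g] for g in reference_genes)
    return {g: ref - value for g, value in ct.items() if g not in reference_genes}


def weighted_score(expr):
    """Average expression within each group, then sum the weighted group means."""
    return sum(
        WEIGHTS[name] * mean(expr[g] for g in genes)
        for name, genes in GROUPS.items()
    )
```

With these placeholder weights, a tumor whose proliferation genes are measured two cycles earlier than the controls (all other genes equal to the controls) scores 2.0, whereas a tumor with every gene equal to the controls scores 0. A clinical assay would additionally rescale the score and define risk cut-offs from its training data.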
A later report evaluated the performance of this assay in 651 ER-positive and node-negative breast tumors, which as part of NSABP B20 had been randomized to tamoxifen or to tamoxifen plus chemotherapy(125). Patients with high recurrence scores had a large benefit from chemotherapy (0.26 relative risk and 27.6% mean absolute decrease in the 10-year distant recurrence rate), whereas the low recurrence score group (54% of patients) derived essentially no benefit. On the basis of these exciting results, this assay was promptly offered as a commercial test (Oncotype Dx). Controversy remains, however, as the 651-patient set included 227 cases from the tamoxifen arm of NSABP B20, data which were used during the model development phase and then reapplied to the larger combined data set, making the prediction of benefit from chemotherapy appear more impressive(126). As with other commercially available tests (such as MammaPrint), it also remains unclear whether it provides a cost-effective advance over standard clinicopathologic factors routinely being obtained on all patients.

Predictive assays are in high demand to drive clinical care decisions. Ultimately, proof of the clinical value of expression signatures will require prospective clinical trials, ideally with decision making randomized as to whether decisions are based on standard clinicopathological factors or on the information derived from expression profile signatures. Prospective trials assessing the 70-gene Amsterdam signature and 21-gene recurrence score assays are now underway to help address these issues (described further below).

Studies to Identify Therapeutic Targets

Previous technologies used to analyze broad aspects of tumor biology, such as histomorphology and chromosomal karyotyping, do not directly provide information about drug sensitivity.
Microarrays not only identify broad gene expression patterns important to tumor biology, but also link these patterns to specific genes that may be targeted by established or experimental drugs. The re-identification of ESR1 (encoding ER) and ERBB2 (encoding HER2) as central genes within identifiable subtypes of breast cancer, as defined by microarray expression profiling, serves as a validation of this technology as a means to identifying relevant drug targets(31). The basal-like molecular phenotype expresses neither ER nor HER2, but does characteristically overexpress epidermal growth factor receptor(32, 33, 56, 65), which is targeted by several new drugs used in colorectal, head and neck, and lung carcinomas, one of which (erlotinib) showed activity in a breast cancer pilot trial(127).

An early study using spotted membrane-based arrays to screen 124 genes on 18 breast tumors highlighted ESR1 and HSP90 as top targets(128). Studies now use microarrays measuring in excess of 40,000 transcripts, in over 100 specimens at a time; one more recent study that combined expression profiles with genomic aCGH data highlighted ERBB2, FGFR1, IKBKB, PROCC, ADAM9, FNTA, ACACA, PNMT, and NR1D1 as top targets that are potentially druggable with available agents(67). Other studies have proposed matrix metalloproteinases(129) or NF-κB(130) as key targets in particular breast cancer histologic subtypes. One potentially exciting finding is that apparent estrogen independence and tamoxifen resistance may be conferred, in some cases, by upregulation of androgen receptor action(131, 132), suggesting there would be resistance to aromatase inhibitors but sensitivity to anti-androgens in these patients. Such findings highlight the ability of gene expression profiling to identify pathways that, although only relevant in a minority of breast cancer patients, can be targeted by existing drugs, facilitating the development of personalized medicine.
However, it is important to realize that gene expression, particularly in a microarray study, does not necessarily correlate with the identification of a useful therapeutic target (Table 1.2). Experimental validation, including typical in vitro drug sensitivity assays, is a necessary step even when a target of a well-established drug appears to be expressed in breast cancer specimens. Expression profiling is only a discovery-based screen, albeit a marvelously comprehensive one, for drug targets.

CHALLENGES AND FUTURE DIRECTIONS

Potential Limitations of Breast-Cancer-Profiling Studies

The emergence of prognostic (associated with clinical outcome) and predictive (associated with response to therapy) gene expression signatures holds promise for attempts to individualize breast cancer treatment; however, some studies have raised concerns about the clinical applicability of these published signatures. One concern was that the 76-gene signature, identified by Wang et al.(102) to predict distant metastasis in untreated node-negative patients of all age groups, shared only three genes with the 70-gene signature developed by the Amsterdam group (van’t Veer, van de Vijver, and colleagues) to predict a five-year distant metastasis risk for node-negative patients younger than 55 years of age(26, 27). This cannot be explained simply by the use of different microarray platforms (the Affymetrix and Agilent platforms share large numbers of overlapping features, particularly for characterized genes important in cancer) nor by the partial differences in patient selection criteria.

Rather, it appears that in breast cancer, multiple distinct gene sets, derived through a variety of approaches and perhaps representing distinct biological processes, can result in prognostic models with a high degree of significance.
Thus, although there may seem to be little consistency between studies in the actual gene lists, a recent report comparing five independent published breast cancer gene signatures(25) validated four of them as correlating equally well with survival when applied to one common primary microarray data set(26-28). The four gene signatures showing significant agreement in outcome predictions are the 70-gene Amsterdam signature, wound-healing response, 21-gene qRT-PCR recurrence score, and intrinsic molecular subtype signatures. This important finding implies that, despite differences in gene lists, these separate signatures may contribute equally to the prognostic space in breast cancer(133). Only the two-gene ratio (HOXB13:IL17BR) failed to show prognostic value in this study, a result also found by others(134).

In a separate approach to devise more robust gene signatures, one study suggested that combining cross-platform microarray data sets(135, 136) into one single data set may yield gene signatures with a higher predictive power that may be more broadly generalized(137). This finding emphasizes, again, how valuable it is to make primary microarray raw data publicly accessible, to allow external validation, optimization, and meta-analysis studies. There can be a selection bias in extracting significant genes from among thousands of measurements(138), suggesting a need to estimate an error rate by cross-validation or bootstrapping during the gene-selection process. Using the same data set from which the 70-gene signature was originally derived(26), an independent group demonstrated that many predictive gene lists can be selected that correlate equally well with survival, but the correlation fluctuates when measured over different subsets of patients(100).
This observation is shared by another study reporting that, using the same data set, the performance of molecular gene signatures can be unstable and depends upon the selection of patient subsets in creating the training sets(35).

After an initial burst of descriptive publications on breast cancer profiles, studies such as these have already become increasingly important, and elements of external validation and data accessibility are now obligatory for publication of gene expression profiling studies in major journals. A recent study, designed to assess the value of a mathematical method named Probably Approximately Correct to quantify the stability of gene signatures, reported that gene expression profiles from thousands of breast cancers may be needed during the discovery phase to deduce a truly definitive predictive list of genes that has an overlap of more than 50% with a second list of genes derived from similar specimens(139). Discovery-based expression profiling studies on this scale will be almost impossible to realize. Nevertheless, it should be kept in mind that the original 70-gene signature, like several other signatures mentioned above, has now been shown to be a significant prognosticator independent of traditional prognostic markers in a variety of retrospective validation sets.

Concerning the issue of technical consistency across platforms, the intrinsic subtypes model, for example, has been validated to show consistent class assignments across three microarray platforms (Applied Biosystems 60-mer oligonucleotide microarrays, Stanford cDNA microarrays, and Agilent 60-mer oligonucleotide microarrays)(140) and further validated against qRT-PCR for a minimal subset of discriminator genes. Furthermore, the basal subtype of carcinoma recognized using these expression profiling platforms has been validated using immunohistochemistry on TMAs (as further discussed below).
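The dependence of gene lists on the choice of training patients, discussed earlier in this section, can be illustrated with a small simulation. The sketch below is not the method of any study cited here. It assumes synthetic, normally distributed expression values, a simple ranking of genes by the absolute difference in mean expression between two outcome groups, and stratified random patient subsets. All function names and parameters are hypothetical.

```python
import random
from statistics import mean


def simulate(n_patients, n_genes, n_informative, effect, seed=0):
    """Synthetic data: the first n_informative genes are shifted by `effect`
    in poor-outcome patients (label 1); all other genes are pure noise."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n_patients)]
    data = [
        [rng.gauss(effect * y if g < n_informative else 0.0, 1.0)
         for g in range(n_genes)]
        for y in labels
    ]
    return data, labels


def top_genes(data, labels, k):
    """Rank genes by absolute difference in group means; return the top k."""
    scores = []
    for g in range(len(data[0])):
        good = [row[g] for row, y in zip(data, labels) if y == 0]
        poor = [row[g] for row, y in zip(data, labels) if y == 1]
        scores.append((abs(mean(poor) - mean(good)), g))
    scores.sort(reverse=True)
    return {g for _, g in scores[:k]}


def mean_overlap(data, labels, k, per_class, n_pairs, seed=0):
    """Average fraction of genes shared by signatures derived from pairs of
    random patient subsets (per_class patients drawn from each outcome group)."""
    rng = random.Random(seed)
    good_idx = [i for i, y in enumerate(labels) if y == 0]
    poor_idx = [i for i, y in enumerate(labels) if y == 1]
    fractions = []
    for _ in range(n_pairs):
        signatures = []
        for _ in range(2):
            chosen = rng.sample(good_idx, per_class) + rng.sample(poor_idx, per_class)
            signatures.append(
                top_genes([data[i] for i in chosen], [labels[i] for i in chosen], k)
            )
        fractions.append(len(signatures[0] & signatures[1]) / k)
    return mean(fractions)
```

For example, calling `mean_overlap(*simulate(60, 200, 5, effect=0.5), k=5, per_class=10, n_pairs=20)` estimates how reproducible a five-gene list is when each list is derived from only 20 patients. Increasing `per_class` or `effect` in the simulation raises the overlap, which is the qualitative behavior described for real signatures above.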
Contribution of Noncancerous Elements to Breast Cancer Expression Profiles

Tumor specimens consist of a complex mixture of invading malignant epithelial cells and their surrounding stroma containing fibroblasts, myofibroblasts, endothelial cells, fat, mast cells and inflammatory cells. In addition, the samples used for gene array studies may contain normal breast tissue or in situ carcinoma. A fundamental limitation of expression profiling is the correct attribution of the gene expression measurements to the cancer cells themselves. Although laborious, and therefore necessarily reducing sample throughput, this issue can be addressed by laser microdissection(60, 141, 142) or similar techniques. Allinen et al.(143) used a magnetic bead-based cell sorting coupled with serial analysis of gene expression to determine the transcriptional profile of stromal and epithelial cells in normal breast, in situ, and invasive breast carcinoma. At a minimum, frozen sections of the tumor tissue used for RNA isolation should be cut and examined by a pathologist to confirm that the sample is correctly diagnosed and representative of the tumor, as well as to determine the percentage of cancer cells.

To a certain extent, the signature contributions of stroma, normal epithelium, and inflammatory cells can be picked out by bioinformatics, but many genes may function differently in different cells and thus wholesale subtraction of such genes is probably inappropriate. In addition, the non-neoplastic elements adjacent to cancer have distinct, abnormal signatures(60) of their own.
Stromal signatures in breast cancer can be matched to soft tissue tumor signatures(99): In a proof-of-principle study using expression profiles of two soft tissue tumors, it was shown that stroma in aggressive breast cancers resembles the solitary fibrous tumor signature (similar to epithelial support fibroblasts), whereas stroma in less-aggressive tumors expresses a fibromatosis-like signature [similar to scar and seen in normal breast stroma as well(60)].

Note that contributions by non-neoplastic stromal elements are still very important to overall tumor biology and clearly can relate to prognosis(144). However, if the expression data are being mined for diagnostic markers or therapeutic targets, genes expressed in stromal cells will not be as disease specific. Although drugs targeting a stromal cell gene might be more likely to have side effects on normal tissues, such agents may also prove particularly effective inasmuch as the target cell lacks a cancer cell population’s capacity to rapidly acquire resistant mutations. Moreover, the stromal reaction signatures seen in breast carcinoma may be shared at least in part with other carcinoma types, and thus could lead to the development of drug targets that could be effective in a variety of epithelial malignancies.

Given the issues with ascribing gene expression profiles derived from a complex mixture of cells to individual cell types within the specimen, TMA technology(145) has become a valuable tool for high-throughput validation of expression profiles with morphologic correlation(146).

TMAs are constructed by transferring small cores of paraffin-embedded tissue samples to a recipient block, which allows the assessment of one biomarker’s expression across hundreds of patients in one experiment. This has the advantage of allowing access to large banks of existing tumor samples independent of those used for expression profiling, which may already be linkable to mature patient outcome data.
The main limitation is the capacity to validate only one gene at a time, and TMA-based analyses are limited to techniques applicable on formalin-fixed, paraffin-embedded tissue (primarily protein-level analysis by immunohistochemistry, and mRNA and DNA analysis by in situ hybridization).

Using TMAs linked to clinical outcome, surrogate immunohistochemical panels can be developed in an attempt to recapitulate the biological subgroupings of breast cancer derived from full gene expression profiles. The basal-like intrinsic subtype can be identified using a panel of established immunostains recognizing ER, PR, HER2, epidermal growth factor receptor, and cytokeratin 5/6 (Figure 1.3)(65). This panel remained prognostic on an independent cohort of patients and, by allowing assessment of standard pathology blocks, showed that the basal-like breast cancer subtype was especially prevalent among premenopausal African American patients(147). Novel prognostic immunostaining panels can also be identified by using TMAs(148).

Gene Predictors as Diagnostic Tests for Clinical Use

The 21-gene assay is commercially available under the name Oncotype Dx. This test requires sending paraffin-embedded tissue to the U.S.-based testing laboratory for qRT-PCR analysis, and is being evaluated prospectively in a large clinical trial named the Trial Assigning IndividuaLized Options for treatment (TAILORx). Sponsored by the National Cancer Institute and in collaboration with the Eastern Cooperative Oncology Group, TAILORx opened in May 2006 and is recruiting more than 10,000 breast cancer patients diagnosed with ER- or PR-positive, HER2-negative breast cancers in 900 sites across the United States. Each tumor will have a recurrence score determined by the multigene qRT-PCR assay, and those with moderate recurrence scores will be randomized to receive adjuvant hormonal therapy with or without chemotherapy.
TAILORx is designed to evaluate if the patients with moderate recurrence score benefit from adjuvant chemotherapy and whether the 21-gene recurrence score test can help in treatment decision plans for these patients.

The 70-gene signature is now available as a custom-designed, condensed microarray chip known as MammaPrint(7), and has recently obtained Food and Drug Administration approval. Testing requires a sample of fresh tumor tissue to be sent to the company’s laboratory in The Netherlands. The 70-gene signature so obtained has been validated, in historical cohort analyses, to predict early distant relapse(149); to provide additional prognostic information beyond what can be determined from patient age and tumor grade, size, and ER status(149); and to perform at least equally well as outcome probabilities derived from Adjuvant! (www.adjuvantonline.com)(150, 151).

A large collaborative prospective trial named Microarray In Node-negative Disease may Avoid Chemotherapy Trial (MINDACT) opened in July 2006 and is currently recruiting 6000 node-negative early-stage breast cancer patients to examine the benefit and risk of chemotherapy among patients having discordant risk assessments from the 70-gene signature versus the clinicopathological risk factors based on Adjuvant! This trial is conducted by the Breast International Group and coordinated by the European Organisation for Research and Treatment of Cancer. The objective of MINDACT is to assess prospectively whether 10% to 15% of low-risk breast cancer patients can be spared from adjuvant chemotherapy, on the basis of results obtained with the MammaPrint microarray(152). As this expression signature, which requires fresh tissue, could not first be applied to retrospective studies with treatment and outcome information, this is a necessary, bold step to carry such a gene expression signature forward into a phase III trial.
Unlike in the TAILORx trial, wherein all patients are to receive the molecular test, patients in MINDACT will be randomized as to whether they receive the test or not, providing a particularly rigorous assessment as to whether the expression profile improves treatment decisions over current gold standard methods.

In closing, gene expression profiling analysis is a fast-moving field, with new and improved platforms and data analysis methods coming out on almost a monthly basis. General principles that have emerged include the value of public access to primary data and the need for external validation studies, lessons that should also be applied to other high-throughput techniques such as array-based comparative genomic hybridization and proteomics studies. Multiple gene expression signatures relating to breast cancer biology, diagnosis, prognosis, and prediction have been described, and the technique has excellent potential as a discovery tool for new therapies. With some modifications, this technology is being applied to clinical specimens, and commercially available tests have been released. Such tests are being assessed in prospective trials, whose findings will not be available for several years. Nevertheless, gene expression profiling studies of breast cancer are to a large extent more advanced than those in other solid tumor types and are providing important information about how expression profiling technology can be used to help patients with cancer.
Table 1.1 List of selected databases with publicly available breast cancer microarray data

Public Web-based database for breast cancer microarray data | URL | Organization | Description
ArrayExpress | http://www.ebi.ac.uk/arrayexpress/ | European Bioinformatics Institute (EBI) | Public data deposition and queries
GEO, Gene Expression Omnibus | http://www.ncbi.nlm.nih.gov/geo/ | National Center for Biotechnology Information (NCBI) | Public data deposition and queries
ONCOMINE, Cancer Profiling Database | http://www.oncomine.org/main/index.jsp | University of Michigan | Public queries
PUMAdb, Princeton University MicroArray database | http://puma.princeton.edu/ | Princeton University | Public queries
SMD, Stanford Microarray database | http://genome-www5.stanford.edu/ | Stanford University | Public queries
UNC-Chapel Hill Microarray database | https://genome.unc.edu/ | University of North Carolina at Chapel Hill | Public queries

Table 1.2 Upregulated genes in a cancer expression profile

Reason upregulated | Example | Implication for targeted therapy
Fundamental to tumor oncogenesis | ERBB2, ESR1 | Excellent, druggable targets
Secondarily-activated gene encoding, e.g., structural protein | CK5, CK17 | Good diagnostic marker but may not be a useful target
Tertiary changes well downstream from primary event | Probably most genes | Less specific to tumor, more side effects
Compensatory change for oncogenesis | CDKN | Targeting this may worsen disease
Expressed by infiltrating normal cells | CD4 | Not specific to tumor, would have side effects
Untranslated transcript | micro RNAs, some ESTs (expressed sequence tags) | Would require new nucleic-acid-based drugs
False discovery | Could apply to any gene | Targets need validation (e.g., by analyzing TMAs)

Figure 1.1 Publicly accessible primary microarray data. (a) Raw image data from Reference 31. The detail shows a magnified view of one sector of the array, with 576 hybridized cDNA spots.
In total, the array consists of 9216 elements (more recent spotted arrays are higher density, with >42,000 elements). (b) GeneXPlorer allows the user to explore cluster diagrams interactively; in this view, genes whose expression is most closely correlated with ERBB2 are shown, including the neighboring GRB7, which is co-amplified with ERBB2 (left, cell lines; right, breast cancer specimens). Figure data from Reference 31 and available at http://genome-www5.stanford.edu.

Figure 1.2 Expression profiling by quantitative real-time polymerase chain reaction (qRT-PCR). (a) Formalin-fixed, paraffin-embedded (FFPE) breast tumor tissue blocks are either sectioned (scroll method) or cored (guided by histology) to obtain a sample dominated by invasive breast cancer tissue. (b) Tissue is deparaffinized, followed by total RNA purification and genomic DNA removal. Approximately 0.5 µg from each sample is enough for 200 qRT-PCR runs. (c) The RNA template is split (e.g., on 96- or 384-well plates) for separate qRT-PCR reactions designed to measure individual genes of interest. In each well, mRNA transcripts are reverse transcribed using gene-specific primers into cDNA. (d) The purified cDNA template is mixed with optimized gene-specific forward and reverse primers for high-throughput SYBR Green-type qRT-PCR. (e) Expression of each gene is quantified from the incorporation curve. (f) Measurements are normalized (e.g., using housekeeper gene controls) and can then be plugged into any of the statistical analysis methods for microarray expression profiles. Unsupervised hierarchical clustering on 26 genes splits 18 breast tumor samples into basal-like and luminal breast cancer subtypes.

Figure 1.3 Immunopanel of Breast Cancer Subtypes. Immunohistochemical surrogate panel derived from gene expression profile data allows molecular subtyping of breast cancers on tissue microarrays.

References:

1. Schena M, Shalon D, Davis RW, Brown PO.
Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 1995;270(5235):467-70.
2. Shalon D, Smith SJ, Brown PO. A DNA microarray system for analyzing complex DNA samples using two-color fluorescent probe hybridization. Genome Res 1996;6(7):639-45.
3. Dupuy A, Simon RM. Critical review of published microarray studies for cancer outcome and guidelines on statistical analysis and reporting. J Natl Cancer Inst 2007;99(2):147-57.
4. Quackenbush J. Microarray analysis and tumor classification. N Engl J Med 2006;354(23):2463-72.
5. Lockhart DJ, Dong H, Byrne MC, Follettie MT, Gallo MV, Chee MS, et al. Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat Biotechnol 1996;14(13):1675-80.
6. Naderi A, Ahmed AA, Barbosa-Morais NL, Aparicio S, Brenton JD, Caldas C. Expression microarray reproducibility is improved by optimising purification steps in RNA amplification and labelling. BMC Genomics 2004;5(1):9.
7. Glas AM, Floore A, Delahaye LJ, Witteveen AT, Pover RC, Bakx N, et al. Converting a breast cancer microarray signature into a high-throughput diagnostic test. BMC Genomics 2006;7:278.
8. Pollack JR, Perou CM, Alizadeh AA, Eisen MB, Pergamenschikov A, Williams CF, et al. Genome-wide analysis of DNA copy-number changes using cDNA microarrays. Nat Genet 1999;23(1):41-6.
9. Coe BP, Ylstra B, Carvalho B, Meijer GA, Macaulay C, Lam WL. Resolving the resolution of array CGH. Genomics 2007;89(5):647-53.
10. Perreard L, Fan C, Quackenbush JF, Mullins M, Gauthier NP, Nelson E, et al. Classification and risk stratification of invasive breast carcinomas using a real-time quantitative RT-PCR assay. Breast Cancer Res 2006;8(2):R23.
11. Szabo A, Perou CM, Karaca M, Perreard L, Quackenbush JF, Bernard PS. Statistical modeling for selecting housekeeper genes. Genome Biol 2004;5(8):R59.
12. Paik S, Shak S, Tang G, Kim C, Baker J, Cronin M, et al.
A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med 2004;351(27):2817-26.
13. Dobbin KK, Kawasaki ES, Petersen DW, Simon RM. Characterizing dye bias in microarray experiments. Bioinformatics 2005;21(10):2430-7.
14. Holloway AJ, van Laar RK, Tothill RW, Bowtell DD. Options available--from start to finish--for obtaining data from DNA microarrays II. Nat Genet 2002;32 Suppl:481-9.
15. Quackenbush J. Microarray data normalization and transformation. Nat Genet 2002;32 Suppl:496-501.
16. Yang YH, Dudoit S, Luu P, Lin DM, Peng V, Ngai J, et al. Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res 2002;30(4):e15.
17. Goryachev AB, Macgregor PF, Edwards AM. Unfolding of microarray data. J Comput Biol 2001;8(4):443-61.
18. Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, et al. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 2003;4(2):249-64.
19. Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, Speed TP. Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res 2003;31(4):e15.
20. Schadt EE, Li C, Su C, Wong WH. Analyzing high-density oligonucleotide gene expression array data. J Cell Biochem 2000;80(2):192-202.
21. Li C, Wong WH. Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection. Proc Natl Acad Sci U S A 2001;98(1):31-6.
22. Kuo WP, Jenssen TK, Butte AJ, Ohno-Machado L, Kohane IS. Analysis of matched mRNA measurements from two different microarray technologies. Bioinformatics 2002;18(3):405-12.
23. Jarvinen AK, Hautaniemi S, Edgren H, Auvinen P, Saarela J, Kallioniemi OP, et al. Are data from different gene expression microarray platforms comparable? Genomics 2004;83(6):1164-8.
24. Shi L, Reid LH, Jones WD, Shippy R, Warrington JA, Baker SC, et al.
The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nat Biotechnol 2006;24(9):1151-61.
25. Fan C, Oh DS, Wessels L, Weigelt B, Nuyten DS, Nobel AB, et al. Concordance among gene-expression-based predictors for breast cancer. N Engl J Med 2006;355(6):560-9.
26. van 't Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature 2002;415(6871):530-6.
27. van de Vijver MJ, He YD, van't Veer LJ, Dai H, Hart AA, Voskuil DW, et al. A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med 2002;347(25):1999-2009.
28. Chang HY, Nuyten DS, Sneddon JB, Hastie T, Tibshirani R, Sorlie T, et al. Robustness, scalability, and integration of a wound-response gene expression signature in predicting breast cancer survival. Proc Natl Acad Sci U S A 2005;102(10):3738-43.
29. Chang HY, Sneddon JB, Alizadeh AA, Sood R, West RB, Montgomery K, et al. Gene expression signature of fibroblast serum response predicts human cancer progression: similarities between tumors and wounds. PLoS Biol 2004;2(2):E7.
30. Hu Z, Fan C, Oh DS, Marron JS, He X, Qaqish BF, et al. The molecular portraits of breast tumors are conserved across microarray platforms. BMC Genomics 2006;7:96.
31. Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, et al. Molecular portraits of human breast tumours. Nature 2000;406(6797):747-52.
32. Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A 2001;98(19):10869-74.
33. Sorlie T, Tibshirani R, Parker J, Hastie T, Marron JS, Nobel A, et al. Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci U S A 2003;100(14):8418-23.
34.
Ma XJ, Wang Z, Ryan PD, Isakoff SJ, Barmettler A, Fuller A, et al. A two-gene expression ratio predicts clinical outcome in breast cancer patients treated with tamoxifen. Cancer Cell 2004;5(6):607-16.
35. Michiels S, Koscielny S, Hill C. Prediction of cancer outcome with microarrays: a multiple random validation strategy. Lancet 2005;365(9458):488-92.
36. Ransohoff DF. Rules of evidence for cancer molecular-marker discovery and validation. Nat Rev Cancer 2004;4(4):309-14.
37. Datta S, Datta S. Evaluation of clustering algorithms for gene expression data. BMC Bioinformatics 2006;7 Suppl 4:S17.
38. Datta S, Datta S. Methods for evaluating clustering algorithms for gene expression data using a reference set of functional classes. BMC Bioinformatics 2006;7:397.
39. Kaufman L, Rousseeuw P. Finding Groups in Data. An introduction to Cluster Analysis: Wiley; 1990.
40. Dudoit S, Fridlyand J. A prediction-based resampling method for estimating the number of clusters in a dataset. Genome Biol 2002;3(7):RESEARCH0036.
41. Jobson J. Applied Multivariate Data Analysis: Categorical and Multivariate Methods. New York: Springer; 1992.
42. Hartigan J. Clustering Algorithms. New York: Wiley; 1975.
43. Gordon AE. Classification: Methods for the Exploratory Analysis of Multivariate Data. New York: Chapman & Hall; 1981.
44. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning. New York, NY: Springer-Verlag New York, Inc.; 2001.
45. Datta S. Comparisons and validation of statistical clustering techniques for microarray gene expression data. Bioinformatics 2003;19(4):459-66.
46. Hartigan JA, Wong MA. A k-means clustering algorithm. Applied Statistics 1979;28:100-108.
47. Eisen MB, Spellman PT, Brown PO, Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A 1998;95(25):14863-8.
48. Saldanha AJ. Java Treeview--extensible visualization of microarray data. Bioinformatics 2004;20(17):3246-8.
49. Storey J.
A direct approach to false discovery rates. Jour. Royal Stat. Soc. B 2002;64:479-498.
50. Bair E, Tibshirani R. Semi-supervised methods to predict patient survival from gene expression data. PLoS Biol 2004;2(4):E108.
51. Tamayo P, Slonim D, Mesirov J, Zhu Q, Kitareewan S, Dmitrovsky E, et al. Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation. Proc Natl Acad Sci U S A 1999;96(6):2907-12.
52. Brown MP, Grundy WN, Lin D, Cristianini N, Sugnet CW, Furey TS, et al. Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc Natl Acad Sci U S A 2000;97(1):262-7.
53. Khan J, Wei JS, Ringner M, Saal LH, Ladanyi M, Westermann F, et al. Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nat Med 2001;7(6):673-9.
54. Tusher VG, Tibshirani R, Chu G. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A 2001;98(9):5116-21.
55. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 2005;102(43):15545-50.
56. Perou CM, Jeffrey SS, van de Rijn M, Rees CA, Eisen MB, Ross DT, et al. Distinctive gene expression patterns in human mammary epithelial cells and breast cancers. Proc Natl Acad Sci U S A 1999;96(16):9212-7.
57. Ross DT, Scherf U, Eisen MB, Perou CM, Rees C, Spellman P, et al. Systematic variation in gene expression patterns in human cancer cell lines. Nat Genet 2000;24(3):227-35.
58. Zajchowski DA, Bartholdi MF, Gong Y, Webster L, Liu HL, Munishkin A, et al. Identification of gene expression profiles that predict the aggressive behavior of breast cancer cells. Cancer Res 2001;61(13):5168-78.
59. Calza S, Hall P, Auer G, Bjohle J, Klaar S, Kronenwett U, et al.
Intrinsic molecular signature of breast cancer in a population-based cohort of 412 patients. Breast Cancer Res 2006;8(4):R34.
60. Finak G, Sadekova S, Pepin F, Hallett M, Meterissian S, Halwani F, et al. Gene expression signatures of morphologically normal breast tissue identify basal-like tumors. Breast Cancer Res 2006;8(5):R58.
61. Kapp AV, Jeffrey SS, Langerod A, Borresen-Dale AL, Han W, Noh DY, et al. Discovery and validation of breast cancer subtypes. BMC Genomics 2006;7:231.
62. Sotiriou C, Neo SY, McShane LM, Korn EL, Long PM, Jazaeri A, et al. Breast cancer classification and prognosis based on gene expression profiles from a population-based study. Proc Natl Acad Sci U S A 2003;100(18):10393-8.
63. Yehiely F, Moyano JV, Evans JR, Nielsen TO, Cryns VL. Deconstructing the molecular portrait of basal-like breast cancer. Trends Mol Med 2006;12(11):537-44.
64. van de Rijn M, Perou CM, Tibshirani R, Haas P, Kallioniemi O, Kononen J, et al. Expression of cytokeratins 17 and 5 identifies a group of breast carcinomas with poor clinical outcome. Am J Pathol 2002;161(6):1991-6.
65. Nielsen TO, Hsu FD, Jensen K, Cheang M, Karaca G, Hu Z, et al. Immunohistochemical and clinical characterization of the basal-like subtype of invasive breast carcinoma. Clin Cancer Res 2004;10(16):5367-74.
66. Rakha EA, Putti TC, Abd El-Rehim DM, Paish C, Green AR, Powe DG, et al. Morphological and immunophenotypic analysis of breast carcinomas with basal and myoepithelial differentiation. J Pathol 2006;208(4):495-506.
67. Chin K, DeVries S, Fridlyand J, Spellman PT, Roydasgupta R, Kuo WL, et al. Genomic and transcriptional aberrations linked to breast cancer pathophysiologies. Cancer Cell 2006;10(6):529-41.
68. Dvorak HF. Tumors: wounds that do not heal. Similarities between tumor stroma generation and wound healing. N Engl J Med 1986;315(26):1650-9.
69. Chi JT, Wang Z, Nuyten DS, Rodriguez EH, Schaner ME, Salim A, et al.
Gene expression programs in response to hypoxia: cell type specificity and prognostic significance in human cancers. PLoS Med 2006;3(3):e47.
70. Liu R, Wang X, Chen GY, Dalerba P, Gurney A, Hoey T, et al. The prognostic role of a gene signature from tumorigenic breast-cancer cells. N Engl J Med 2007;356(3):217-26.
71. Al-Hajj M, Wicha MS, Benito-Hernandez A, Morrison SJ, Clarke MF. Prospective identification of tumorigenic breast cancer cells. Proc Natl Acad Sci U S A 2003;100(7):3983-8.
72. Reya T, Morrison SJ, Clarke MF, Weissman IL. Stem cells, cancer, and cancer stem cells. Nature 2001;414(6859):105-11.
73. Singh SK, Hawkins C, Clarke ID, Squire JA, Bayani J, Hide T, et al. Identification of human brain tumour initiating cells. Nature 2004;432(7015):396-401.
74. Shipitsin M, Campbell LL, Argani P, Weremowicz S, Bloushtain-Qimron N, Yao J, et al. Molecular definition of breast tumor heterogeneity. Cancer Cell 2007;11(3):259-73.
75. Song LL, Miele L. Cancer stem cells--an old idea that's new again: implications for the diagnosis and treatment of breast cancer. Expert Opin Biol Ther 2007;7(4):431-8.
76. Fidler IJ. The pathogenesis of cancer metastasis: the 'seed and soil' hypothesis revisited. Nat Rev Cancer 2003;3(6):453-8.
77. Hedenfalk I, Duggan D, Chen Y, Radmacher M, Bittner M, Simon R, et al. Gene-expression profiles in hereditary breast cancer. N Engl J Med 2001;344(8):539-48.
78. Berns EM, van Staveren IL, Verhoog L, van de Ouweland AM, Meijer-van Gelder M, Meijers-Heijboer H, et al. Molecular profiles of BRCA1-mutated and matched sporadic breast tumours: relation with clinico-pathological features. Br J Cancer 2001;85(4):538-45.
79. Hedenfalk I, Ringner M, Ben-Dor A, Yakhini Z, Chen Y, Chebil G, et al. Molecular classification of familial non-BRCA1/BRCA2 breast cancer. Proc Natl Acad Sci U S A 2003;100(5):2532-7.
80. Weigelt B, Glas AM, Wessels LF, Witteveen AT, Peterse JL, van't Veer LJ.
Gene expression profiles of primary breast tumors maintained in distant metastases. Proc Natl Acad Sci U S A 2003;100(26):15901-5.
81. Berns EM, Foekens JA, Vossen R, Look MP, Devilee P, Henzen-Logmans SC, et al. Complete sequencing of TP53 predicts poor response to systemic therapy of advanced breast cancer. Cancer Res 2000;60(8):2155-62.
82. Borresen-Dale AL. TP53 and breast cancer. Hum Mutat 2003;21(3):292-300.
83. Geisler S, Lonning PE, Aas T, Johnsen H, Fluge O, Haugen DF, et al. Influence of TP53 gene alterations and c-erbB-2 expression on the response to treatment with doxorubicin in locally advanced breast cancer. Cancer Res 2001;61(6):2505-12.
84. Miller LD, Smeds J, George J, Vega VB, Vergara L, Ploner A, et al. An expression signature for p53 status in human breast cancer predicts mutation status, transcriptional effects, and patient survival. Proc Natl Acad Sci U S A 2005;102(38):13550-5.
85. Cleton-Jansen AM. E-cadherin and loss of heterozygosity at chromosome 16 in breast carcinogenesis: different genetic pathways in ductal and lobular breast cancer? Breast Cancer Res 2002;4(1):5-8.
86. Coradini D, Pellizzaro C, Veneroni S, Ventura L, Daidone MG. Infiltrating ductal and lobular breast carcinomas are characterised by different interrelationships among markers related to angiogenesis and hormone dependence. Br J Cancer 2002;87(10):1105-11.
87. Korkola JE, DeVries S, Fridlyand J, Hwang ES, Estep AL, Chen YY, et al. Differentiation of lobular versus ductal breast carcinomas by expression microarray analysis. Cancer Res 2003;63(21):7167-75.
88. Zhao H, Langerod A, Ji Y, Nowels KW, Nesland JM, Tibshirani R, et al. Different gene expression patterns in invasive lobular and ductal carcinomas of the breast. Mol Biol Cell 2004;15(6):2523-36.
89. Bertucci F, Finetti P, Rougemont J, Charafe-Jauffret E, Cervera N, Tarpin C, et al. Gene expression profiling identifies molecular subtypes of inflammatory breast cancer. Cancer Res 2005;65(6):2170-8.
90.
Bertucci F, Finetti P, Rougemont J, Charafe-Jauffret E, Nasser V, Loriod B, et al. Gene expression profiling for molecular characterization of inflammatory breast cancer and prediction of response to chemotherapy. Cancer Res 2004;64(23):8558-65.
91. Hannemann J, Velds A, Halfwerk JB, Kreike B, Peterse JL, van de Vijver MJ. Classification of ductal carcinoma in situ by gene expression profiling. Breast Cancer Res 2006;8(5):R61.
92. Kote-Jarai Z, Matthews L, Osorio A, Shanley S, Giddings I, Moreews F, et al. Accurate prediction of BRCA1 and BRCA2 heterozygous genotype using expression profiling after induced DNA damage. Clin Cancer Res 2006;12(13):3896-901.
93. Robbins P, Pinder S, de Klerk N, Dawkins H, Harvey J, Sterrett G, et al. Histological grading of breast carcinomas: a study of interobserver agreement. Hum Pathol 1995;26(8):873-9.
94. Sotiriou C, Wirapati P, Loi S, Harris A, Fox S, Smeds J, et al. Gene expression profiling in breast cancer: understanding the molecular basis of histologic grade to improve prognosis. J Natl Cancer Inst 2006;98(4):262-72.
95. Ivshina AV, George J, Senko O, Mow B, Putti TC, Smeds J, et al. Genetic reclassification of histologic grade delineates new clinical subtypes of breast cancer. Cancer Res 2006;66(21):10292-301.
96. Sotiriou C, Wirapati P, Loi S, Desmedt C, Durbecq V, Harris A, et al. Better characterization of estrogen receptor (ER) positive luminal subtypes using genomic grade. Breast Cancer Res Treat 2005;94(Supplement 1):S19.
97. Dai H, van't Veer L, Lamb J, He YD, Mao M, Fine BM, et al. A cell proliferation signature is a marker of extremely poor outcome in a subpopulation of breast cancer patients. Cancer Res 2005;65(10):4059-66.
98. Eden P, Ritz C, Rose C, Ferno M, Peterson C. "Good Old" clinical markers have similar power in breast cancer prognosis as microarray gene expression profilers. Eur J Cancer 2004;40(12):1837-41.
99. West RB, Nuyten DS, Subramanian S, Nielsen TO, Corless CL, Rubin BP, et al.
Determination of stromal signatures in breast carcinoma. PLoS Biol 2005;3(6):e187.
100. Ein-Dor L, Kela I, Getz G, Givol D, Domany E. Outcome signature genes in breast cancer: is there a unique set? Bioinformatics 2005;21(2):171-8.
101. Thomassen M, Tan Q, Eiriksdottir F, Bak M, Cold S, Kruse TA. Prediction of metastasis from low-malignant breast cancer by gene expression profiling. Int J Cancer 2007;120(5):1070-5.
102. Wang Y, Klijn JG, Zhang Y, Sieuwerts AM, Look MP, Yang F, et al. Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet 2005;365(9460):671-9.
103. Foekens JA, Atkins D, Zhang Y, Sweep FC, Harbeck N, Paradiso A, et al. Multicenter validation of a gene expression-based prognostic signature in lymph node-negative primary breast cancer. J Clin Oncol 2006;24(11):1665-71.
104. Smid M, Wang Y, Klijn JG, Sieuwerts AM, Zhang Y, Atkins D, et al. Genes associated with breast cancer metastatic to bone. J Clin Oncol 2006;24(15):2261-7.
105. Minn AJ, Gupta GP, Siegel PM, Bos PD, Shu W, Giri DD, et al. Genes that mediate breast cancer metastasis to lung. Nature 2005;436(7050):518-24.
106. Naderi A, Teschendorff AE, Barbosa-Morais NL, Pinder SE, Green AR, Powe DG, et al. A gene-expression signature to predict survival in breast cancer across independent data sets. Oncogene 2006.
107. Huang E, Cheng SH, Dressman H, Pittman J, Tsou MH, Horng CF, et al. Gene expression predictors of breast cancer outcomes. Lancet 2003;361(9369):1590-6.
108. Troester MA, Hoadley KA, Sorlie T, Herbert BS, Borresen-Dale AL, Lonning PE, et al. Cell-type-specific responses to chemotherapeutics in breast cancer. Cancer Res 2004;64(12):4218-26.
109. Gyorffy B, Serra V, Jurchott K, Abdul-Ghani R, Garber M, Stein U, et al. Prediction of doxorubicin sensitivity in breast tumors based on gene expression profiles of drug-resistant cell lines correlates with patient survival. Oncogene 2005;24(51):7542-51.
110.
Oh DS, Troester MA, Usary J, Hu Z, He X, Fan C, et al. Estrogen-regulated genes predict survival in hormone receptor-positive breast cancers. J Clin Oncol 2006;24(11):1656-64.
111. Chang JC, Wooten EC, Tsimelzon A, Hilsenbeck SG, Gutierrez MC, Tham YL, et al. Patterns of resistance and incomplete response to docetaxel by gene expression profiling in breast cancer patients. J Clin Oncol 2005;23(6):1169-77.
112. Hannemann J, Oosterkamp HM, Bosch CA, Velds A, Wessels LF, Loo C, et al. Changes in gene expression associated with response to neoadjuvant chemotherapy in breast cancer. J Clin Oncol 2005;23(15):3331-42.
113. Ayers M, Symmans WF, Stec J, Damokosh AI, Clark E, Hess K, et al. Gene expression profiles predict complete pathologic response to neoadjuvant paclitaxel and fluorouracil, doxorubicin, and cyclophosphamide chemotherapy in breast cancer. J Clin Oncol 2004;22(12):2284-93.
114. Dressman HK, Hans C, Bild A, Olson JA, Rosen E, Marcom PK, et al. Gene expression profiles of multiple breast cancer phenotypes and response to neoadjuvant chemotherapy. Clin Cancer Res 2006;12(3 Pt 1):819-26.
115. Rouzier R, Perou CM, Symmans WF, Ibrahim N, Cristofanilli M, Anderson K, et al. Breast cancer molecular subtypes respond differently to preoperative chemotherapy. Clin Cancer Res 2005;11(16):5678-85.
116. Nimeus-Malmstrom E, Ritz C, Eden P, Johnsson A, Ohlsson M, Strand C, et al. Gene expression profilers and conventional clinical markers to predict distant recurrences for premenopausal breast cancer patients after adjuvant chemotherapy. Eur J Cancer 2006;42(16):2729-37.
117. Osborne CK. Tamoxifen in the treatment of breast cancer. N Engl J Med 1998;339(22):1609-18.
118. Clarke R, Liu MC, Bouker KB, Gu Z, Lee RY, Zhu Y, et al. Antiestrogen resistance in breast cancer and the role of estrogen receptor signaling. Oncogene 2003;22(47):7316-39.
119. Boccardo F, Rubagotti A, Puntoni M, Guglielmini P, Amoroso D, Fini A, et al.
Switching to anastrozole versus continued tamoxifen treatment of early breast cancer: preliminary results of the Italian Tamoxifen Anastrozole Trial. J Clin Oncol 2005;23(22):5138-47.
120. Jakesz R, Jonat W, Gnant M, Mittlboeck M, Greil R, Tausch C, et al. Switching of postmenopausal women with endocrine-responsive early breast cancer to anastrozole after 2 years' adjuvant tamoxifen: combined results of ABCSG trial 8 and ARNO 95 trial. Lancet 2005;366(9484):455-62.
121. Thurlimann B, Keshaviah A, Coates AS, Mouridsen H, Mauriac L, Forbes JF, et al. A comparison of letrozole and tamoxifen in postmenopausal women with early breast cancer. N Engl J Med 2005;353(26):2747-57.
122. Buzdar AU, Guastalla JP, Nabholtz JM, Cuzick J, Group AT. Impact of chemotherapy regimens prior to endocrine therapy: Results from the ATAC (Anastrozole and Tamoxifen, Alone or in Combination) trial. Cancer 2006;107(3):472-80.
123. Jansen MP, Foekens JA, van Staveren IL, Dirkzwager-Kiel MM, Ritstier K, Look MP, et al. Molecular classification of tamoxifen-resistant breast carcinomas by gene expression profiling. J Clin Oncol 2005;23(4):732-40.
124. Goetz MP, Suman VJ, Ingle JN, Nibbe AM, Visscher DW, Reynolds CA, et al. A two-gene expression ratio of homeobox 13 and interleukin-17B receptor for prediction of recurrence and survival in women receiving adjuvant tamoxifen. Clin Cancer Res 2006;12(7 Pt 1):2080-7.
125. Paik S, Tang G, Shak S, Kim C, Baker J, Kim W, et al. Gene expression and benefit of chemotherapy in women with node-negative, estrogen receptor-positive breast cancer. J Clin Oncol 2006;24(23):3726-34.
126. Ioannidis JP. Gene expression profiling for individualized breast cancer chemotherapy: success or not? Nat Clin Pract Oncol 2006;3(10):538-9.
127. Yang SX, Simon RM, Tan AR, Nguyen D, Swain SM. Gene expression patterns and profile changes pre- and post-erlotinib treatment in patients with metastatic breast cancer. Clin Cancer Res 2005;11(17):6226-32.
128.
Martin KJ, Kritzman BM, Price LM, Koh B, Kwan CP, Zhang X, et al. Linking gene expression patterns to therapeutic groups in breast cancer. Cancer Res 2000;60(8):2232-8.
129. Pusztai L, Sotiriou C, Buchholz TA, Meric F, Symmans WF, Esteva FJ, et al. Molecular profiles of invasive mucinous and ductal carcinomas of the breast: a molecular case study. Cancer Genet Cytogenet 2003;141(2):148-53.
130. Van Laere S, Van der Auwera I, Van den Eynden GG, Fox SB, Bianchi F, Harris AL, et al. Distinct molecular signature of inflammatory breast cancer by cDNA microarray analysis. Breast Cancer Res Treat 2005;93(3):237-46.
131. Becker M, Sommer A, Kratzschmar JR, Seidel H, Pohlenz HD, Fichtner I. Distinct gene expression patterns in a tamoxifen-sensitive human mammary carcinoma xenograft and its tamoxifen-resistant subline MaCa 3366/TAM. Mol Cancer Ther 2005;4(1):151-68.
132. Doane AS, Danso M, Lal P, Donaton M, Zhang L, Hudis C, et al. An estrogen receptor-negative breast cancer subset characterized by a hormonally regulated transcriptional program and response to androgen. Oncogene 2006;25(28):3994-4008.
133. Massague J. Sorting out breast-cancer gene signatures. N Engl J Med 2007;356(3):294-7.
134. Reid JF, Lusa L, De Cecco L, Coradini D, Veneroni S, Daidone MG, et al. Limits of predictive models using microarray data for breast cancer clinical treatment outcome. J Natl Cancer Inst 2005;97(12):927-30.
135. Gruvberger S, Ringner M, Chen Y, Panavally S, Saal LH, Borg A, et al. Estrogen receptor status in breast cancer is associated with remarkably distinct gene expression patterns. Cancer Res 2001;61(16):5979-84.
136. West M, Blanchette C, Dressman H, Huang E, Ishida S, Spang R, et al. Predicting the clinical status of human breast cancer by using gene expression profiles. Proc Natl Acad Sci U S A 2001;98(20):11462-7.
137. Warnat P, Eils R, Brors B. Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes.
BMC Bioinformatics 2005;6:265.
138. Ambroise C, McLachlan GJ. Selection bias in gene extraction on the basis of microarray gene-expression data. Proc Natl Acad Sci U S A 2002;99(10):6562-6.
139. Ein-Dor L, Zuk O, Domany E. Thousands of samples are needed to generate a robust gene list for predicting outcome in cancer. Proc Natl Acad Sci U S A 2006;103(15):5923-8.
140. Sorlie T, Wang Y, Xiao C, Johnsen H, Naume B, Samaha RR, et al. Distinct molecular mechanisms underlying clinically relevant subtypes of breast cancer: gene expression analyses across three different platforms. BMC Genomics 2006;7:127.
141. Zhu G, Reynolds L, Crnogorac-Jurcevic T, Gillett CE, Dublin EA, Marshall JF, et al. Combination of microdissection and microarray analysis to identify gene expression changes between differentially located tumour cells in breast cancer. Oncogene 2003;22(24):3742-8.
142. Fuller AP, Palmer-Toy D, Erlander MG, Sgroi DC. Laser capture microdissection and advanced molecular analysis of human breast cancer. J Mammary Gland Biol Neoplasia 2003;8(3):335-45.
143. Allinen M, Beroukhim R, Cai L, Brennan C, Lahti-Domenici J, Huang H, et al. Molecular characterization of the tumor microenvironment in breast cancer. Cancer Cell 2004;6(1):17-32.
144. Nelson CM, Bissell MJ. Of extracellular matrix, scaffolds, and signaling: tissue architecture regulates development, homeostasis, and cancer. Annu Rev Cell Dev Biol 2006;22:287-309.
145. Kononen J, Bubendorf L, Kallioniemi A, Barlund M, Schraml P, Leighton S, et al. Tissue microarrays for high-throughput molecular profiling of tumor specimens. Nat Med 1998;4(7):844-7.
146. van de Rijn M, Gilks CB. Applications of microarrays to histopathology. Histopathology 2004;44(2):97-108.
147. Carey LA, Perou CM, Livasy CA, Dressler LG, Cowan D, Conway K, et al. Race, breast cancer subtypes, and survival in the Carolina Breast Cancer Study. JAMA 2006;295(21):2492-502.
148. Ring BZ, Seitz RS, Beck R, Shasteen WJ, Tarr SM, Cheang MC, et al.
Novel prognostic immunohistochemical biomarker panel for estrogen receptor-positive breast cancer. J Clin Oncol 2006;24(19):3039-47.
149. Buyse M, Loi S, van't Veer L, Viale G, Delorenzi M, Glas AM, et al. Validation and clinical utility of a 70-gene prognostic signature for women with node-negative breast cancer. J Natl Cancer Inst 2006;98(17):1183-92.
150. Ravdin PM, Siminoff LA, Davis GJ, Mercer MB, Hewlett J, Gerson N, et al. Computer program to assist in making decisions about adjuvant therapy for women with early breast cancer. J Clin Oncol 2001;19(4):980-91.
151. Olivotto IA, Bajdik CD, Ravdin PM, Speers CH, Coldman AJ, Norris BD, et al. Population-based validation of the prognostic model ADJUVANT! for early breast cancer. J Clin Oncol 2005;23(12):2716-25.
152. Bogaerts J, Cardoso F, Buyse M, Braga S, Loi S, Harrison JA, et al. Gene signature evaluation as a prognostic tool: challenges in the design of the MINDACT trial. Nat Clin Pract Oncol 2006;3(10):540-51.

CHAPTER THREE

IMMUNOHISTOCHEMICAL DETECTION USING THE NEW RABBIT MONOCLONAL ANTIBODY SP1 OF ESTROGEN RECEPTOR IN BREAST CANCER IS SUPERIOR TO MOUSE MONOCLONAL ANTIBODY 1D5 IN PREDICTING SURVIVAL 2

A majority of breast cancers are estrogen receptor (ER) positive, and ER positivity predicts benefit from hormonal therapies such as tamoxifen. ER testing is therefore essential for every invasive breast cancer at diagnosis. Until the early 1990s, the dextran-coated charcoal (DCC) assay was the most widely used clinical method for ER assessment. This method provides a quantitative measurement, and a cutoff value of 10 fmol/mg protein is most frequently used to separate ER-positive from ER-negative tumors. Although higher ER expression levels measured by DCC show a good linear correlation with improved survival, the assay is labor intensive. It may also underestimate ER, depending on the tumor cellularity of the sample.
2 A version of this manuscript was published. Cheang MCU, Treaba DO, Speers CH, Olivotto IA, Bajdik CD, Chia SK, Goldstein LC, Gelmon KA, Huntsman D, Gilks CB, Nielsen TO, Gown AM. Immunohistochemical detection using the new rabbit monoclonal antibody SP1 of estrogen receptor in breast cancer is superior to mouse monoclonal antibody 1D5 in predicting survival. Journal of Clinical Oncology 2006;24(36):5637-44.

Monoclonal anti-ER antibodies used in immunohistochemistry have largely replaced DCC worldwide for clinical testing since ER was shown to be expressed in tumor nuclei in standard histologic sections. The choice among commercially available antibodies can be subjective, based on experience and common practice rather than on detailed formal assessment of sensitivity, specificity and predictive capacity. Therefore, we carried out a head-to-head comparison of the two best available monoclonal anti-ER antibodies (SP1 and 1D5) against patient outcome and the ER values determined by DCC methodology. This work is the first and largest study to validate the quality of ER measurements by IHC. Since ER is one of the key biomarkers used to define breast cancer subtypes, it is exceptionally important to use the best anti-ER antibody. False-negative or false-positive ER calls may affect the determination of candidate biomarkers that differentiate the ER+/Luminal A and ER+/Luminal B subtypes.

Reprinted with permission of Journal of Clinical Oncology, American Society of Clinical Oncology.

INTRODUCTION

Estrogen receptor (ER) may be the best example of a tumor biomarker whose assay drives therapeutic decision-making(1-4). In the 1990s, immunohistochemical (IHC) evaluation of ER largely supplanted the dextran-coated charcoal ligand-binding assay (DCC), as it is more economical and yields concurrent histopathologic correlation.
Several anti-ER antibodies are used clinically; two mouse monoclonals (6F11 and 1D5) have been compared in clinical studies and shown to have similar sensitivities(5, 6). Studies have also compared 1D5 with a new rabbit monoclonal, SP1(7, 8), which has eightfold higher affinity(8). Our preliminary studies found SP1 more sensitive than 1D5 for detecting estrogen receptor expression in breast cancer, both in a duplicate-redundancy 431-case tissue microarray and in 121 whole sections of clinical materials from multiple institutions(9) (Supplemental Table 3.1). Here we present the first population-based series comparing the immunohistochemical detection of ER by antibodies 1D5 and SP1 with results from DCC in patients with long-term clinical follow-up, to investigate their prognostic value in breast cancer.

MATERIALS AND METHODS

Study Population

The study cohort comprises 4,150 female patients with newly diagnosed, invasive breast cancer in British Columbia whose tumor specimens were tested by a central estrogen receptor (ER) laboratory at Vancouver Hospital between 1986 and 1992. The DCC protocol is as published(10-12) and reproduced on our supplemental website (http://www.gpec.ubc.ca/index.php?content=papers/ER.php). Median follow-up was 12.4 years, and median age at diagnosis was 60 years. All patients had been referred to the British Columbia Cancer Agency and have staging, pathology, treatment and follow-up information(13, 14). During the study era, 75% of breast cancer cases in the province were referred; non-referred patients were generally elderly or treated by mastectomy without indications for adjuvant therapy(15).

Abstracted clinical information includes age, histology, grade, tumor size, number of involved axillary nodes, lymphatic or vascular invasion (LVI), ER status by the DCC method(11), type of local and initial adjuvant systemic therapy (AST), and dates of diagnosis, first local, regional or distant recurrence, and death.
A subset of these patients was included in a recent population-based study validating the prognostic model ADJUVANT!(16). Supplemental Table 3.2 summarizes cohort characteristics. The study was approved by the Clinical Research Ethics Board of the University of British Columbia and BC Cancer Agency.

Tissue Microarrays and Immunohistochemistry

The Vancouver Hospital ER laboratory retained a single archival block from each case. This material had been frozen prior to neutral buffered formalin fixation. Hematoxylin/eosin stained slides from these blocks were reviewed by two pathologists to identify areas of invasive breast carcinoma. Tissue microarrays (TMAs) were constructed as described(17, 18) (details reproduced on the supplemental publication site, http://www.gpec.ubc.ca/index.php?content=papers/ER.php). Using one core per case, 17 TMA blocks were required. Sections 4 µm thick were immunostained using DakoCytomation EnVision and System-HRP in a two-step technique. Slides were deparaffinized with xylene and rehydrated through three alcohol changes. Endogenous peroxidase activity was quenched by incubating for five minutes with 0.03% hydrogen peroxide/sodium azide. Slides were then incubated with one of two primary anti-ER antibodies, 1D5 (1:100; Dako Corporation, Carpinteria, CA, USA) or SP1 (1:250; LabVision, Fremont, CA, USA), followed by the peroxidase-labeled polymer, in a Tris-HCl buffer containing stabilizing protein and an anti-microbial agent, using sequential 30-minute incubations. Staining was completed by a 10-minute incubation with 3,3'-diaminobenzidine (DAB)+ substrate-chromogen. Primary antibody was omitted in negative controls. External positive controls were slides from breast cancers with previously documented ER expression. A previous study using a duplicate-redundancy 431-case tissue microarray demonstrated 96% agreement between duplicate cores, for both 1D5 and SP1(9) (Supplemental Table 3.1).
Stained TMA slides were digitally scanned and linked to a relational database(19) and are publicly available for review (https://www.gpecimage.ubc.ca/tma/web/viewer.php; username: ersp11D5, password: er4150).

ER Scoring System

TMAs were visually scored by two pathologists (DT, AMG) for percentage of tumor cell nuclear positivity, and scored as Negative (<1%); Positive: 1+ (1-25%); Positive: 2+ (25-75%); or Positive: 3+ (>75%). For most analyses, IHC scores are dichotomized, with a score ≥ 1+ (at least 1% positive nuclei) taken as ER positive(20). Pathologists were blinded to clinical outcomes.

Statistical Analysis

Statistical analysis was performed using SPSS 13.0 (Chicago, IL) and R 2.1.1 (http://www.r-project.org). In univariate analyses, overall survival (OS), breast cancer specific survival (BCSS) and relapse-free survival (RFS) were estimated using Kaplan-Meier(21) curves, and survival differences were determined by log-rank tests(22, 23). A trend test was used when three or more ordered groups were compared. For BCSS, survival time was censored at death if the cause was not breast cancer, or if the patient was still alive at the end of the study. Six patients with unknown cause of death were excluded from BCSS analysis. For RFS, survival time was also censored at death if the cause was not breast cancer, or if the patient was alive without relapse at the end of the study. For OS, survival time was censored if the patient was still alive at study end. Cox proportional hazards(22, 24) models were used to calculate adjusted hazard ratios accounting for covariates. Hypothesis testing was performed using the Wald statistic. Smoothed plots of weighted Schoenfeld residuals were used to test proportional hazard assumptions(25). Kappa(26) statistics and Kendall's tau-b(27) tests were used to measure agreement between the two ER immunostains and DCC assay, and correlation of ER status to pathological variables.
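As a rough illustration of the agreement measures named above, the following Python sketch computes sensitivity, specificity and Cohen's kappa for a dichotomized immunostain against dichotomized DCC. It is not the SPSS or R code used in the study; the function name `agreement_stats` and its argument names are hypothetical, and the cell counts are those reported in Table 3.2a.

```python
def agreement_stats(neg_neg, neg_pos, pos_neg, pos_pos):
    """Agreement of a binary IHC call with a binary DCC call.

    Arguments are cell counts named <IHC result>_<DCC result>,
    e.g. neg_pos = IHC negative but DCC positive (>= 10 fmol/mg).
    """
    n = neg_neg + neg_pos + pos_neg + pos_pos

    # DCC is treated as the reference standard.
    sensitivity = pos_pos / (pos_pos + neg_pos)
    specificity = neg_neg / (neg_neg + pos_neg)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (neg_neg + pos_pos) / n
    ihc_neg, ihc_pos = neg_neg + neg_pos, pos_neg + pos_pos
    dcc_neg, dcc_pos = neg_neg + pos_neg, neg_pos + pos_pos
    expected = (ihc_neg * dcc_neg + ihc_pos * dcc_pos) / n**2
    kappa = (observed - expected) / (1 - expected)

    return {"sensitivity": sensitivity,
            "specificity": specificity,
            "kappa": kappa}


# Counts from Table 3.2a (IHC result versus DCC < 10 or >= 10 fmol/mg).
sp1 = agreement_stats(neg_neg=716, neg_pos=445, pos_neg=64, pos_pos=2628)
one_d5 = agreement_stats(neg_neg=718, neg_pos=687, pos_neg=64, pos_pos=2392)

for name, stats in (("SP1", sp1), ("1D5", one_d5)):
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

With these counts the sketch gives kappa values of about 0.654 for SP1 and 0.536 for 1D5, in line with the values reported in the Results.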
Differences involving pathological factors were compared using Pearson's chi-square(28) and Mann-Whitney U(29) tests for categorical and continuous variables, respectively. All statistical tests were two-sided and p-values < 0.05 were considered significant.

RESULTS

Our previous study(9) found that, to detect ER expression, SP1 is more sensitive than 1D5 on both whole sections (5.3% absolute increased sensitivity) and tissue microarrays (6.6%) and suggested that SP1 is the better prognostic marker for breast cancer survival (Supplemental Table 3.1). In this study we assessed the value of SP1 on a population-based series of 4,150 invasive breast cancers. 48% of patients had tumors >2 cm, 51% had grade III tumors and 44% had positive nodal status (Supplemental Table 3.2). Among the 1,838 patients receiving lumpectomy, 91% were given radiation, whereas 29% of the 2,241 patients receiving mastectomy were given radiation. 41% of patients received no adjuvant systemic therapy (AST) and 33% received tamoxifen-only AST.

Comparison of ER Expression by SP1 and 1D5 with DCC and Clinicopathological Characteristics

Table 3.1 summarizes the immunohistochemical staining results for SP1 and 1D5 on TMAs. Figures 3.1a-c show a single tumor sample that is negative for ER by 1D5 but strongly positive by SP1. This case had an ER concentration of 48 fmol/mg by DCC. Figures 3.1d-f show another tumor sample, moderately positive by 1D5 but strongly positive by SP1 (ER = 174 fmol/mg). The number of interpretable cases for the two immunostains is slightly different due to occasional core dropout from TMAs during sectioning and staining. Overall, 69.5% of cases were positive by SP1 versus 63.1% by 1D5. In absolute terms, the SP1 antibody had 11% more strong 3+ stains than 1D5, and 1D5 had 6.4% more negative stains than SP1. There is a strong positive correlation between the two antibodies (Kendall's tau-b: 0.790, p < 1.0 x 10^-17).
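The scoring bins from the ER Scoring System and the positivity rates quoted above can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and because the published bins overlap at exactly 25% (1-25% and 25-75%), the sketch assumes that a value of exactly 25% is scored 1+. The counts are those of Table 3.1.

```python
def er_score(percent_positive):
    """Map percentage of positive tumor nuclei to the semi-quantitative score."""
    if percent_positive < 1:
        return "Negative"
    if percent_positive <= 25:   # assumption: exactly 25% is scored 1+
        return "1+"
    if percent_positive <= 75:
        return "2+"
    return "3+"


def is_er_positive(percent_positive):
    """Dichotomized call: any score of 1+ or higher is ER positive."""
    return er_score(percent_positive) != "Negative"


# Counts from Table 3.1 (4,150 cores per antibody).
TABLE_3_1 = {
    "SP1": {"Negative": 1255, "1+": 358, "2+": 1272, "3+": 1231,
            "Uninterpretable": 34},
    "1D5": {"Negative": 1520, "1+": 409, "2+": 1420, "3+": 775,
            "Uninterpretable": 26},
}


def positivity_rate(counts):
    """Percent ER positive among interpretable cores."""
    interpretable = sum(n for level, n in counts.items()
                        if level != "Uninterpretable")
    positive = interpretable - counts["Negative"]
    return 100 * positive / interpretable


for antibody, counts in TABLE_3_1.items():
    print(antibody, round(positivity_rate(counts), 1))
```

Excluding uninterpretable cores from the denominator is what reproduces the 69.5% (SP1) and 63.1% (1D5) positivity rates; the percentages printed in Table 3.1 itself use all 4,150 cores as the denominator.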
Among clinical tumor samples, 3,884 cases were originally tested by DCC assay and categorized into four groups(10, 11): negative (≤1 fmol/mg), low (2-9 fmol/mg), moderate (10-159 fmol/mg), and high (≥160 fmol/mg). Frequencies were 2.9% (111/3884), 17.5% (678/3884), 41.1% (1596/3884) and 38.6% (1499/3884), respectively. DCC values ≥ 10.0 fmol/mg were considered positive (the clinical cut-point)(11, 12). ER status by SP1 (kappa: 0.654, sensitivity 86%, specificity 92%, p < 1.0 x 10^-17) agreed better with DCC than did ER status by 1D5 (kappa: 0.536, sensitivity 78%, specificity 92%, p < 1.0 x 10^-17) (Table 3.2a).

In total, 4,105 cases were interpretable for both antibodies. SP1 stained positive in 8% of cases that were negative by 1D5 (Table 3.2b). Of these 337 discrepant cases, 92% were DCC positive (median concentration 67.0 fmol/mg). Among the 77 discrepant cases identified as negative by SP1 but positive by 1D5, 65% were DCC positive (median concentration 34.0 fmol/mg). The SP1+/1D5- discordant cases had significantly higher ER concentrations by DCC (p=0.021) and significantly lower tumor grades (p=0.022) than the SP1-/1D5+ discordant cases; there was no significant difference in tumor size (p=0.110). Among the SP1-/1D5- group, 64% were DCC negative; among the SP1+/1D5+ group, 98% were DCC positive.

SP1+, 1D5+ and DCC+ all correlate with older age at diagnosis (Kendall's tau-b (p-values): 0.149 (<1.0 x 10^-17), 0.173 (<1.0 x 10^-17) and 0.165 (<1.0 x 10^-17), respectively), but inversely correlate with grade (Kendall's tau-b (p-values): -0.241 (9.96 x 10^-60), -0.196 (1.88 x 10^-38) and -0.282 (3.25 x 10^-86)) and tumor size (Kendall's tau-b (p-values): -0.107 (4.56 x 10^-12), -0.083 (7.42 x 10^-8) and -0.108 (1.2 x 10^-11)).

Prognostic Values of SP1, 1D5 and DCC Positive Expression

To assess the prognostic value of ER in a general population, we examined SP1, 1D5 and DCC in the whole cohort. Both immunostains and DCC identified ER positive patients as having an improved BCSS.
The DCC assay demonstrates the most significant and strongest linear trend for expression: the higher the expression, the better the BCSS (Figure 3.2a (SP1): p=4.78 x 10^-13; Figure 3.2b (1D5): p=1.65 x 10^-7; Figure 3.2c (DCC): p=6.08 x 10^-16). A similarly significant trend is observed for RFS (SP1: p=3.98 x 10^-8, 1D5: p=3.71 x 10^-6 and DCC: p=9.25 x 10^-14) but not for OS (SP1: p=0.132, 1D5: p=0.511 and DCC: p=0.154). The 5- and 10-year survival probabilities by binarized ER status from Kaplan-Meier analysis are reported in Supplemental Table 3.3a. The survival advantage of ER+ patients starts to attenuate after 5 years; the same phenomenon is seen with all three detection methods.

AST for this cohort was prescribed according to guidelines based on age, tumor size, LVI, nodal status and DCC-determined ER level(16). High risk was defined as node-positive or, if node-negative, presence of LVI, or tumor > 2 cm and ER negative (DCC < 10 fmol/mg). Patients deemed "low risk" at the time of diagnosis were not given any AST; they had a better outcome (10-yr BCSS 82%, 95% C.I. 80-84) than patients receiving AST such as the tamoxifen group (10-yr BCSS 69%, 95% C.I. 66-71; p=8.4 x 10^-21). Survival analyses based on ER status were therefore done separately in these two subgroups.

For the "pure prognostic" subset of patients receiving no AST, SP1+ was associated with a 14% absolute increase in BCSS at 10 years (p=5.0 x 10^-8), 1D5+ with 9% (p=2.2 x 10^-4) and DCC+ with 19% (p=2.3 x 10^-12; Supplemental Table 3.3b). For RFS, SP1+ and DCC+ were associated with significant absolute increases of 6% (p=0.002) and 13% (p=5.1 x 10^-7) at 10 years, whereas 1D5+ showed a 3% increase in 10-yr RFS (p=0.06).

Among 1,377 patients receiving tamoxifen as their only AST, only 84 were ER negative by DCC. SP1, 1D5 and DCC all identified ER positive patients as having better outcomes than ER negative cases.
The absolute increases in 10-yr BCSS for SP1+, 1D5+ and DCC+ are 14% (p=4.7 x 10^-6), 10% (p=1.5 x 10^-4) and 24% (p=3.3 x 10^-7), respectively; the increases in 10-yr RFS are 10% (p=1.7 x 10^-4), 7% (p=0.001) and 18% (p=3.9 x 10^-5), respectively (Supplemental Table 3.3c).

Concordant and Discordant Cases between SP1 and 1D5

As expected, SP1+/1D5+ double positive cases have better BCSS (hazard ratio 0.667; 95% C.I. 0.592-0.753) compared to SP1-/1D5- cases (p=5.00 x 10^-11; Figure 3.3). The SP1+/1D5- group has a hazard ratio of 0.647 (0.423-0.989, p=0.044) compared to the SP1-/1D5+ group. Importantly, there are no significant survival differences between the SP1+/1D5+ and SP1+/1D5- cases (p=0.408). The SP1-/1D5+ group does not have significantly different survival from the SP1-/1D5- group (HR 0.93 (0.63-1.36), p=0.698) but appears inferior to the SP1+ group (HR 1.45 (0.99-2.11), p=0.055). In relapse-free survival analysis, the SP1-/1D5+ cases have a similar hazard compared to the SP1-/1D5- group, with hazard ratio 0.972 (0.683-1.38), whereas SP1+/1D5- cases have a relapse-free hazard similar to SP1+/1D5+ (hazard ratio 1.07 (0.885-1.28)) (Supplemental Figure 3.1). The same pattern in BCSS and RFS is seen in both the no-AST and tamoxifen-only subsets (data not shown). These results further support that the SP1 antibody is a better prognostic marker than the commonly used ER antibody 1D5.

Multivariable Analysis

Smoothed plots of rescaled Schoenfeld residuals were used to test proportional hazard assumptions. All covariates followed proportional hazards, except ER status, which varied slightly during the long period of follow-up. The hazard rate of breast cancer death and relapse is different in the first 5 years, consistent with data reported by Hess et al(30).
Supplemental Table 3.4a-b summarizes the adjusted hazard ratios of ER positive status detected by the two immunostains and DCC; the reported values were determined from Cox regression models including age, tumor size, grade, LVI and nodal status as covariates for BCSS (Supplemental Table 3.4a) and RFS (Supplemental Table 3.4b), respectively. The results show that SP1, 1D5 and DCC all serve as independent prognostic factors for BCSS and RFS in a general population after adjusting for the listed clinico-pathological prognosticators. However, none of SP1, 1D5 or DCC remained significant among the low-risk patients receiving no AST.

For patients receiving tamoxifen as their only AST, SP1, 1D5 and DCC each retained significance when individually added to a model including the same clinical parameters listed above as covariates. The BCSS hazard ratios and p-values for SP1+, 1D5+ and DCC+ were 0.636 (95% C.I. 0.499-0.811), p=2.55 x 10^-4; 0.697 (0.559-0.870), p=1.37 x 10^-3; and 0.649 (0.450-0.935), p=2.03 x 10^-2, respectively. The RFS hazard ratios and p-values for SP1+, 1D5+ and DCC+ were 0.683 (0.540-0.864), p=1.45 x 10^-3; 0.744 (0.601-0.921), p=6.59 x 10^-3; and 0.709 (0.491-1.03), p=0.068. To test which detection method was most significantly associated with prognosis, a Cox regression model with age, tumor size, grade, LVI and nodal status was fitted with SP1, 1D5 and DCC included together. SP1 was the most significant prognostic factor among the three ER detection methods, both for BCSS (HR: 0.664 (0.517-0.852), p=1.27 x 10^-3) and for RFS (HR: 0.699 (0.550-0.888), p=3.40 x 10^-3) in this subgroup. The hazard ratios of the clinico-pathological covariates are listed in Table 3.3a and 3.3b.
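The relationship between a reported hazard ratio, its confidence interval and the Wald p-value can be checked approximately with the Python sketch below. It is a back-calculation for illustration, not the Cox model fit itself: it assumes the 95% C.I. is symmetric on the log-hazard scale and uses a normal approximation, so with rounded published inputs it only approximates the reported p-value. The function name `wald_from_ci` is hypothetical.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate Wald z and two-sided p from a hazard ratio and its 95% C.I.

    Assumes the interval was built as exp(beta +/- z_crit * SE), so the
    standard error of beta = log(HR) can be recovered from the interval width.
    """
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = beta / se
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# SP1 in the tamoxifen-only BCSS model: HR 0.664 (0.517-0.852).
z, p = wald_from_ci(0.664, 0.517, 0.852)
print(f"z = {z:.2f}, p = {p:.2e}")
```

For the SP1 hazard ratio above, this back-calculation lands close to the reported p=1.27 x 10^-3; small differences are expected because the hazard ratio and interval limits are rounded to three significant figures.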
DISCUSSION

Using population-based TMAs of over 4,000 patients with long-term follow-up, this study not only validates the prognostic value of estrogen receptor immunostaining in breast carcinomas with and without adjuvant tamoxifen, but also demonstrates the superiority of the SP1 rabbit monoclonal antibody to the standard 1D5 mouse monoclonal, both in agreement with DCC assay and in correlation with outcome. ER is a well-established determinant of survival and correlate of other clinico-pathological variables(31). Accordingly, the vast majority of our analyses are confirmatory validations rather than tests of hypotheses, and a Bonferroni correction would be inappropriate.

Because of reduced cost, shorter turn-around, morphologic correlation and ease of specimen handling, immunohistochemical testing for ER has largely replaced DCC assays(32), so good antibodies play an important role in clinical decision-making. Developed against an N-terminal epitope of ER-alpha, the mouse monoclonal antibody 1D5 is currently in wide use(20). Another mouse monoclonal, 6F11, has similar sensitivity(33, 34). Previous studies employing limited numbers of cases without outcome data suggested superior clinical sensitivity of the recently developed rabbit monoclonal antibody SP1, which recognizes the C-terminal portion of ER-alpha(7).

We show that SP1 is superior to 1D5 in identifying ER+ patients who have a good prognosis, in the whole cohort (representing a British Columbia population), in patients receiving no AST (a pure prognostic group) and in patients receiving solely tamoxifen AST. Our data suggest that 1D5 fails to identify some ER+ patients who would benefit from adjuvant tamoxifen, thus potentially denying a well-tolerated, efficacious treatment to approximately 6% of breast cancer patients.

SP1 more closely approximates the prognostic value of DCC assay than does 1D5.
This suggests that IHC using SP1 is a better substitute for the previous "gold standard" in a clinical diagnostic setting. Although immunohistochemistry assays in general can suffer from interlaboratory variability, ER immunohistochemistry has been shown to be robust and reproducible(35).

In this study, ER expression correlates with substantially greater improvements in disease-specific and relapse-free survival among tamoxifen treated patients than among patients receiving no AST. In the latter group, although ER+ patients (detected by any method) had better survival probabilities, their survival advantage slowly decreased and the survival curves eventually crossed at 18 years (Supplemental Table 3.3b and Supplemental Figure 3.2). This suggests some ER+ patients have slow growing but high-risk tumors not related to tamoxifen resistance.

Quantitative DCC results showed the best positive linear trend for better survival, and DCC is better than semiquantitative visual IHC at identifying the dose-effect of ER on outcome. Quantitative image analysis may be necessary for IHC techniques to match DCC assay in this regard(32, 35, 36).

Because of the exceptionally large study size, TMAs were constructed from single cores. Previous data from our laboratory found 96% agreement between duplicate cores for SP1 and for 1D5 immunostains(9) (Supplemental Table 3.1). The almost fivefold difference in the numbers of SP1+/1D5- versus SP1-/1D5+ discordant cases in the current larger study argues strongly that the SP1+/1D5- cases (illustrated in Figure 3.1) do not merely represent false negative 1D5 results. Because the applicability of these results to whole tissue sections might be questioned, we performed studies comparing SP1 and 1D5 in an unselected multi-institutional series of 121 whole-section breast cancer specimens(9), and the absolute ER positivity rate was still 5.3% higher with SP1 compared with 1D5.
That study also tested the two antibodies on a single-institution 431-case formalin-fixed breast cancer tissue microarray independent of the one presented here; as might be expected, the absolute rates of ER positivity were higher in whole sections than in tissue microarrays [SP1: 83.5% (whole section) vs. 79.3% (TMA); 1D5: 78.2% (whole section) vs. 72.7% (TMA); Supplemental Table 3.1], but importantly the relative increase in sensitivity of SP1 vs. 1D5 was maintained in both whole sections (5.3%) and the TMA format (6.6%), and agrees with the results presented here on 4,150 cases (6.4%). Although it is difficult to compare absolute rates because the contributing patient populations were different, in this 4,150-case series both antibodies did show lower absolute rates of positivity, which might reflect the fact that the source tissues (as per DCC protocol) were frozen prior to fixation. The fact that the DCC assay was positive in 36% of the tumors that were negative by both 1D5 and SP1 IHC supports loss of ER immunoreactivity in tissues frozen prior to fixation.

We provide evidence that the new rabbit monoclonal antibody SP1 represents an improved standard for immunohistochemical estrogen receptor assessment in breast cancer. Recent studies have made accurate assessment of ER status even more critical, given its implications for predicting response to systemic therapies(37).

Table 3.1. Frequencies of SP1 and 1D5. Frequencies of estrogen receptor antibodies SP1 and 1D5 immunostaining expression in 4,150 breast cancer patient samples.

Expression Level                 SP1: No. of Patients   SP1: %   1D5: No. of Patients   1D5: %
Negative (<1%)                   1,255                  30.2     1,520                  36.6
1%-25%                           358                    8.6      409                    9.9
25%-75%                          1,272                  30.7     1,420                  34.2
>75%                             1,231                  29.7     775                    18.7
Uninterpretable / missing core   34                     0.8      26                     0.6
Total                            4,150                  100      4,150                  100

Table 3.2. Comparison of SP1 and 1D5 against DCC.
(a) Comparison of ER status as detected by SP1 and 1D5 immunohistochemistry, using DCC assay as the Gold Standard. (b) Frequencies of SP1 and 1D5 discrepancies among patient samples interpretable for both SP1 and 1D5 antibodies. NOTE. SP1 identifies as positive 8.2% (337/4105) of cases that would have been considered negative by 1D5, and as negative 1.9% (77/4105) of cases considered positive by 1D5.

(a)
                                   ER Concentration Detected by DCC (fmol/mg)
ER Status          Result          <10        ≥10        Predictive Value
Detected by SP1    SP1 negative    716        445        Negative, 62%
                   SP1 positive    64         2,628      Positive, 98%
                                   Specificity, 92%      Sensitivity, 86%
Detected by 1D5    1D5 negative    718        687        Negative, 51%
                   1D5 positive    64         2,392      Positive, 97%
                                   Specificity, 92%      Sensitivity, 78%

(b)
                   SP1 Negative    SP1 Positive    Total
1D5 Negative       1,176           337             1,513
1D5 Positive       77              2,515           2,592
Total              1,253           2,852           4,105

Table 3.3. Cox model of SP1, 1D5 and DCC among patients treated with adjuvant tamoxifen only. Cox proportional hazard regression analysis to identify the best ER detection method (SP1, 1D5 or DCC) in patients receiving tamoxifen only as their AST. Age at diagnosis, grade, tumor size, lymphovascular invasion (LVI) and nodal status were included as covariates. HR = adjusted hazard ratio and C.I. = confidence interval. Cases with missing values in any of the covariates or ER status were excluded from the analysis. Hazard estimates were not computed for the non-significant test variables. The total number of cases (n) included in each regression model is listed accordingly. (a) Breast Cancer Disease Specific Survival (BCSS) analysis. (b) Relapse-free Survival (RFS) analysis.

(a) BCSS (Total n = 1150)

Covariate           Comparison                 BCSS HR (95% C.I.)     p-value
Age at Diagnosis    40-49 vs <40               0.067 (0.008-0.536)    1.09 x 10^-2
                    50-65 vs <40               0.049 (0.007-0.367)    3.28 x 10^-3
                    >65 vs <40                 0.061 (0.008-0.450)    6.11 x 10^-3
Grade               II vs I                    1.19 (0.599-2.35)      0.623
                    III vs I                   2.20 (1.12-4.33)       2.23 x 10^-2
Tumor Size (cm)     >2 and ≤5 vs ≤2            1.53 (1.23-1.90)       1.54 x 10^-4
                    >5 vs ≤2                   3.11 (2.16-4.49)       1.23 x 10^-9
LVI                 Positive vs negative       1.15 (0.925-1.44)      0.204
Nodal Status        Positive vs negative       2.04 (1.60-2.59)       7.65 x 10^-9
ER status* by SP1   Positive vs negative       0.664 (0.517-0.852)    1.27 x 10^-3

* In a model including SP1, 1D5 and DCC, SP1 is selected as the most significant contributor to BCSS (1D5 p = 0.267; DCC p = 0.372). In a model including 1D5 as the only ER measure, 1D5+ HR 0.699 (0.557-0.877), p = 1.93 x 10^-3. In a model including DCC as the only ER measure, DCC+ HR 0.648 (0.450-0.934), p = 0.0200.

(b) RFS (Total n = 1104)

Covariate           Comparison                 RFS HR (95% C.I.)      p-value
Age at Diagnosis    40-49 vs <40               0.151 (0.019-1.17)     7.04 x 10^-2
                    50-65 vs <40               0.071 (0.010-0.522)    9.41 x 10^-3
                    >65 vs <40                 0.080 (0.011-0.589)    1.32 x 10^-2
Grade               II vs I                    1.55 (0.786-3.05)      0.206
                    III vs I                   2.64 (1.35-5.18)       4.75 x 10^-3
Tumor Size (cm)     >2 and ≤5 vs ≤2            1.50 (1.22-1.84)       9.92 x 10^-5
                    >5 vs ≤2                   2.65 (1.81-3.87)       4.73 x 10^-7
LVI                 Positive vs negative       1.23 (0.995-1.52)      5.52 x 10^-2
Nodal Status        Positive vs negative       1.84 (1.47-2.29)       7.64 x 10^-8
ER status* by SP1   Positive vs negative       0.699 (0.550-0.888)    3.40 x 10^-3

* In a model including SP1, 1D5 and DCC, SP1 is selected as the most significant contributor to RFS (1D5 p = 0.555; DCC p = 0.577). In a model including 1D5 as the only ER measure, 1D5+ HR 0.754 (0.606-0.939), p = 0.0116. In a model including DCC as the only ER measure, DCC+ HR 0.709 (0.490-1.03), p = 0.0675.

Supplemental Table 3.1. Frequencies of SP1 and 1D5 using TMA and Whole Sections. (a) TMA of 431 formalin-fixed invasive breast carcinomas, with duplicate core samples: Concordance and Discordance between SP1 and 1D5 immunostains. NOTE. Overall, SP1 was positive in 79.3% (314/396) of patients and 1D5 was positive in 72.7% (282/388) of patients. Agreement between duplicate cores for SP1 stain: 95.7% agreement (kappa statistic: 0.876, p = 4.12 x 10^-56).
Agreement between duplicate cores for 1D5 stain: 96.0% agreement (kappa statistic: 0.901, p = 8.48 x 10^-59). Prognostic Value of SP1 and 1D5 for BCSS: SP1-positive HR, 0.577 (0.393-0.849) relative to SP1 negative; p-value = 0.005. 1D5-positive HR, 0.619 (0.428-0.895) relative to 1D5 negative; p-value = 0.011. Ten-year BCSS: SP1 positive/1D5 positive, 75%; SP1 positive/1D5 negative, 71%; SP1 negative/1D5 positive, 67% (based on three patients); SP1 negative/1D5 negative, 60%.

                          ER status by SP1
ER status by 1D5          Negative    Positive    Total
Negative                  73          30          103
Positive                  3           277         280
Total                     76          307         383

(b) Whole Sections of Invasive Breast Carcinomas (case series derived from multiple US institutions; patient outcomes not available): Concordance and Discordance Between SP1 and 1D5. NOTE. Overall, SP1 was positive in 83.5% of patients (101 of 121), and 1D5 was positive in 78.2% of patients (93 of 119) on the same series of tumors.

                          ER status by SP1
ER status by 1D5          Negative    Positive    Total
Negative                  18          8           26
Positive                  1           92          93
Total                     19          100         119

Supplemental Table 3.2. Patients' characteristics. Summary of clinical-pathological characteristics of the 4,150 patients with invasive breast tumors included in the tissue microarray series.

Characteristic                                              Number    %
Diagnostic Age (yr)
  <40                                                       301       7.3
  40-49                                                     857       20.7
  50-65                                                     1478      35.6
  >65                                                       1514      36.5
Menstrual status at referral
  Premenopausal                                             1202      28.96
  Postmenopausal                                            2849      68.65
  Pregnant                                                  2         0.05
  Unknown                                                   97        2.34
Tumor Size (cm)
  ≤2                                                        2116      51.0
  >2 and ≤5                                                 1747      42.1
  >5                                                        241       5.8
  Unknown                                                   46        1.1
Grade
  I (well differentiated)                                   214       5.2
  II (moderately well or partially differentiated)          1615      38.9
  III (poorly differentiated)                               2133      51.4
  Unknown                                                   188       4.5
Histology
  Ductal                                                    3756      90.5
  Lobular                                                   316       7.6
  Other                                                     78        1.9
Nodal Status
  Negative                                                  2311      55.7
  Positive                                                  1825      44.0
  Unknown                                                   14        0.3
Local Therapy
  No breast surgery                                         71        1.7
  Mastectomy + radiation therapy                            650       15.7
  Mastectomy alone                                          1591      38.3
  Lumpectomy alone                                          167       4.0
  Lumpectomy + radiation therapy                            1671      40.3
Adjuvant Systemic Therapy
  None                                                      1702      41.0
  Tamoxifen only                                            1372      33.1
  Ovarian ablation or hormone other than Tamoxifen          7         0.2
  Chemotherapy only (AC, CMF, FAC, CEF)                     749       18.0
  Chemotherapy + Tamoxifen                                  315       7.6
  Ovarian ablation or hormone + Chemotherapy                5         0.1

Supplemental Table 3.3. Survival Probabilities of SP1, 1D5 and DCC. The 5- and 10-year survival probabilities by SP1, 1D5 and DCC for Overall Survival (OS), Breast Cancer Death Specific Survival (BCSS), and Relapse-free Survival (RFS) in (a) All Patients, (b) Patients receiving no Adjuvant Systemic Therapy (AST) and (c) Patients treated by Tamoxifen only as AST. Note that the absolute increase of BCSS or RFS for ER+ at 10 years had a lower magnitude than at 5 years. IHC = Immunohistochemistry. SP1 and 1D5 columns are IHC results; DCC columns are in fmol/mg.

(a) All Patients

                        SP1-          SP1+          1D5-          1D5+          DCC <10       DCC ≥10
OS: n                   1255          2861          1520          2604          789           3095
5 yr OS (95% C.I.)      68% (65-70)   83% (81-84)   71% (68-73)   83% (81-84)   63% (60-67)   82% (80-83)
10 yr OS (95% C.I.)     57% (55-60)   64% (62-65)   59% (57-62)   63% (61-65)   53% (50-57)   63% (62-65)
BCSS: n                 1255          2855          1520          2598          789           3089
5 yr BCSS (95% C.I.)    72% (69-74)   88% (87-89)   75% (73-77)   88% (87-89)   67% (64-71)   87% (86-88)
10 yr BCSS (95% C.I.)   65% (62-68)   77% (75-78)   68% (65-70)   76% (74-78)   61% (57-64)   76% (74-77)
RFS: n                  1227          2797          1485          2547          770           3028
5 yr RFS (95% C.I.)     64% (61-67)   78% (76-79)   66% (64-69)   78% (76-79)   59% (56-63)   77% (75-79)
10 yr RFS (95% C.I.)    59% (56-61)   66% (64-68)   60% (58-63)   66% (64-67)   55% (52-59)   66% (64-68)

Log-rank test p-values for panel (a) (assignment to outcome and detection method is not recoverable from the source layout): 0.0065; 0.7754; 4.7 x 10^-7; 3.0 x 10^-12; 0.0001; 8.43 x 10^-14; 5.05 x 10^-7; 5.95 x 10^-10; 2.17 x 10^-5.

(b) Patients receiving no AST

                        SP1-          SP1+          1D5-          1D5+          DCC <10       DCC ≥10
OS: n                   570           1115          687           1001          381           1193
5 yr OS (95% C.I.)      75% (71-79)   88% (86-90)   76% (78-81)   88% (85-90)   69% (64-74)   88% (86-90)
10 yr OS (95% C.I.)     63% (59-67)   71% (68-74)   67% (63-70)   69% (66-72)   59% (53-63)   72% (69-74)
Log-rank p-value        SP1: 0.0598                 1D5: 0.3957                 DCC: 0.0004
BCSS: n                 570           1111          687           997           381           1189
5 yr BCSS (95% C.I.)    80% (76-83)   94% (93-95)   83% (80-85)   94% (92-95)   74% (70-79)   94% (92-95)
10 yr BCSS (95% C.I.)   73% (69-77)   87% (85-89)   77% (73-80)   86% (84-88)   68% (63-72)   87% (85-89)
Log-rank p-value        SP1: 5.0 x 10^-8            1D5: 2.2 x 10^-4            DCC: 2.3 x 10^-12
RFS: n                  567           1114          1485          2547          770           3028
5 yr RFS (95% C.I.)     71% (67-75)   82% (79-84)   74% (70-77)   81% (79-84)   65% (60-70)   82% (80-85)
10 yr RFS (95% C.I.)    66% (61-69)   72% (69-75)   68% (64-71)   71% (68-74)   61% (56-66)   74% (71-76)
Log-rank p-value        SP1: 0.002                  1D5: 0.06                   DCC: 5.1 x 10^-7

(c) Patients receiving Tamoxifen only as AST

                        SP1-          SP1+          1D5-          1D5+          DCC <10       DCC ≥10
OS: n                   233           1135          313           1057          84            1218
5 yr OS (95% C.I.)      57% (50-63)   79% (76-81)   63% (57-68)   79% (76-81)   45% (34-55)   77% (74-79)
10 yr OS (95% C.I.)     46% (40-53)   56% (53-59)   50% (44-55)   56% (53-59)   37% (27-47)   55% (52-58)
Log-rank p-value        SP1: 0.0005                 1D5: 0.0093                 DCC: 1.5 x 10^-4
BCSS: n                 232           1127          311           1050          84            1216
5 yr BCSS (95% C.I.)    63% (57-69)   86% (84-88)   70% (64-74)   86% (83-88)   53% (42-63)   84% (81-86)
10 yr BCSS (95% C.I.)   57% (50-64)   71% (68-74)   61% (55-66)   71% (68-74)   46% (34-56)   70% (67-72)
Log-rank p-value        SP1: 4.7 x 10^-6            1D5: 1.5 x 10^-4            DCC: 3.3 x 10^-7
RFS: n                  222           1082          297           1009          76            1164
5 yr RFS (95% C.I.)     59% (53-66)   78% (76-81)   64% (59-70)   78% (76-81)   48% (36-59)   76% (74-79)
10 yr RFS (95% C.I.)    54% (47-61)   64% (61-67)   57% (51-63)   64% (61-67)   45% (33-56)   63% (60-66)
Log-rank p-value        SP1: 1.7 x 10^-4            1D5: 0.001                  DCC: 3.9 x 10^-5

Supplemental Table 3.4. Cox Model in whole cohort. Cox proportional hazard regression analysis for ER status on the whole cohort, including age at diagnosis, grade, tumor size, LVI and nodal status as covariates. Each of SP1+, 1D5+ and DCC+ was estimated in a separate model, and listed accordingly. HR = adjusted hazard ratio and C.I. = confidence interval. Cases with missing values in any of the covariates were excluded from the analysis. The total number of cases (n) included in each regression model is listed accordingly. (a) Breast Cancer Disease Specific Survival (BCSS) analysis. (b) Relapse-free Survival (RFS) analysis.

(a) BCSS: HR (95% C.I.), p-value

Covariate                           ER by SP1 (n = 3755)                 ER by 1D5 (n = 3765)                 ER by DCC (n = 3561)
Age at Diagnosis: 40-49 vs <40      0.798 (0.640-0.995), 4.53 x 10^-2    0.778 (0.624-0.970), 2.58 x 10^-2    0.780 (0.622-0.980), 3.27 x 10^-2
Age at Diagnosis: 50-65 vs <40      0.921 (0.749-1.13), 0.436            0.911 (0.741-1.12), 0.377            0.906 (0.733-1.12), 0.364
Age at Diagnosis: >65 vs <40        1.04 (0.837-1.28), 0.75              1.02 (0.821-1.26), 0.889             1.02 (0.821-1.27), 0.851
Grade: II vs I                      1.74 (1.09-2.76), 2.00 x 10^-2       1.75 (1.10-2.79), 1.76 x 10^-2       1.51 (0.949-2.41), 8.22 x 10^-2
Grade: III vs I                     2.67 (1.69-4.24), 2.74 x 10^-5       2.76 (1.74-4.37), 1.51 x 10^-5       2.33 (1.47-3.70), 3.23 x 10^-4
Tumor Size (cm): >2 and ≤5 vs ≤2    1.68 (1.48-1.92), 4.71 x 10^-15      1.69 (1.48-1.93), 2.71 x 10^-15      1.67 (1.47-1.91), 3.00 x 10^-14
Tumor Size (cm): >5 vs ≤2           2.28 (1.84-2.83), 7.10 x 10^-14      2.34 (1.89-2.90), 9.27 x 10^-15      2.28 (1.83-2.84), 2.64 x 10^-13
LVI: Positive vs negative           1.38 (1.20-1.59), 4.40 x 10^-6       1.38 (1.20-1.58), 6.68 x 10^-6       1.38 (1.20-1.59), 8.57 x 10^-6
Nodal Status: Positive vs negative  2.13 (1.85-2.45), 2.76 x 10^-26      2.12 (1.84-2.43), 8.18 x 10^-26      2.12 (1.84-2.44), 4.76 x 10^-25
ER status: Positive vs negative     0.725 (0.638-0.824), 8.174 x 10^-7   0.778 (0.687-0.881), 7.56 x 10^-5    0.706 (0.611-0.815), 2.16 x 10^-6

(b) RFS: HR (95% C.I.), p-value

Covariate                           ER by SP1 (n = 3682)                 ER by 1D5 (n = 3692)                 ER by DCC (n = 3492)
Age at Diagnosis: 40-49 vs <40      0.812 (0.666-0.991), 4.05 x 10^-2    0.804 (0.659-0.980), 3.10 x 10^-2    0.823 (0.670-1.01), 6.34 x 10^-2
Age at Diagnosis: 50-65 vs <40      0.809 (0.670-0.977), 2.78 x 10^-2    0.810 (0.670-0.978), 2.81 x 10^-2    0.814 (0.669-0.990), 3.96 x 10^-2
Age at Diagnosis: >65 vs <40        0.834 (0.687-1.01), 6.66 x 10^-2     0.831 (0.684-1.01), 6.12 x 10^-2     0.859 (0.702-1.05), 0.139
Grade: II vs I                      1.67 (1.16-2.40), 6.21 x 10^-3       1.68 (1.16-2.42), 5.54 x 10^-3       1.59 (1.08-2.33), 1.86 x 10^-2
Grade: III vs I                     2.34 (1.63-3.36), 4.51 x 10^-6       2.37 (1.65-3.41), 2.93 x 10^-6       2.21 (1.51-3.25), 4.63 x 10^-5
Tumor Size (cm): >2 and ≤5 vs ≤2    1.44 (1.29-1.61), 3.30 x 10^-10      1.45 (1.29-1.62), 1.78 x 10^-10      1.47 (1.31-1.65), 1.23 x 10^-10
Tumor Size (cm): >5 vs ≤2           1.73 (1.40-2.14), 4.80 x 10^-7       1.75 (1.41-2.16), 2.58 x 10^-7       1.78 (1.43-2.21), 1.94 x 10^-7
LVI: Positive vs negative           1.26 (1.11-1.42), 2.48 x 10^-4       1.26 (1.11-1.42), 2.51 x 10^-4       1.28 (1.13-1.45), 1.10 x 10^-4
Nodal Status: Positive vs negative  1.76 (1.56-2.00), 8.29 x 10^-20      1.76 (1.56-1.99), 1.16 x 10^-19      1.76 (1.55-1.99), 1.17 x 10^-18
ER status: Positive vs negative     0.836 (0.744-0.940), 2.71 x 10^-3    0.854 (0.763-0.955), 5.90 x 10^-3    0.806 (0.704-0.922), 1.63 x 10^-3

Figure 3.1. Representative TMA Images of SP1 and 1D5. a-c: case #953, an infiltrating ductal carcinoma: (a) H&E histology, (b) tumor cells completely negative for expression of ER using 1D5 monoclonal antibody, (c) 3+ immunostaining for ER expression using SP1 monoclonal antibody. d-f: case #963, an infiltrating ductal carcinoma: (d) H&E histology, (e) 2+ moderate immunostaining by 1D5, (f) 3+ immunostaining for ER using SP1.

Figure 3.2. BCSS analysis of SP1, 1D5 and DCC. Breast cancer death specific survival (BCSS) for (a) SP1, (b) 1D5 and (c) DCC assays. The 10-yr BCSS estimates are labeled accordingly. The linear trend across semi-quantitative factor levels for survival differences was tested by log-rank.

Figure 3.3. BCSS analysis of SP1 and 1D5 concordant and discordant cases. Breast cancer specific survival (BCSS) functions of SP1 and 1D5 concordant and discordant cases (Total n = 4128). The 10-yr BCSS estimates are labeled accordingly. The SP1-/1D5+ group has worse BCSS than would be expected for ER+ patients.

Supplemental Figure 3.1. Relapse-free survival (RFS) of SP1 and 1D5 concordant and discordant cases (Total n = 4,019). The 10-yr RFS estimates are labeled accordingly. The SP1-/1D5+ group had a lower relapse-free survival probability than expected for ER+ patients.
Supplemental Figure 3.2. BCSS analysis of SP1, 1D5 and DCC among patients receiving no AST. Kaplan-Meier plots for cumulative breast cancer specific survival (BCSS) in 1698 patients not receiving any adjuvant systemic treatment (AST), by ER status as assessed with SP1 (a), 1D5 (b) or the quantitative charcoal-dextran method (c). The 10-yr BCSS estimates are labeled.

References:

1. McGuire WL. Breast cancer prognostic factors: evaluation guidelines. J Natl Cancer Inst 1991;83(3):154-5.
2. Clark GM. Prognostic and predictive factors. In: Harris JR, Lippman ME, Morrow M, Hellman S, eds. Diseases of the Breast. Lippincott-Raven 1996:461-485.
3. Crowe JP, Jr., Gordon NH, Hubay CA, Shenk RR, Zollinger RM, Brumberg DJ, et al. Estrogen receptor determination and long term survival of patients with carcinoma of the breast. Surg Gynecol Obstet 1991;173(4):273-8.
4. Clinical practice guidelines for the use of tumor markers in breast and colorectal cancer. Adopted on May 17, 1996 by the American Society of Clinical Oncology. J Clin Oncol 1996;14(10):2843-77.
5. Vassallo J, Pinto GA, Alvarenga JM, Zeferino LC, Chagas CA, Metze K. Comparison of immunoexpression of 2 antibodies for estrogen receptors (1D5 and 6F11) in breast carcinomas using different antigen retrieval and detection methods. Appl Immunohistochem Mol Morphol 2004;12(2):177-82.
6. Pertschuk LP, Feldman JG, Kim YD, Braithwaite L, Schneider F, Braverman AS, et al. Estrogen receptor immunocytochemistry in paraffin embedded tissues with ER1D5 predicts breast cancer endocrine response more accurately than H222Sp gamma in frozen sections or cytosol-based ligand-binding assays. Cancer 1996;77(12):2514-9.
7. Rossi S, Laurino L, Furlanetto A, Chinellato S, Orvieto E, Canal F, et al. Rabbit monoclonal antibodies: a comparative study between a novel category of immunoreagents and the corresponding mouse monoclonal antibodies. Am J Clin Pathol 2005;124(2):295-302.
8. Huang Z, Zhu W, Szekeres G, Xia H. Development of new rabbit monoclonal antibody to estrogen receptor: immunohistochemical assessment on formalin-fixed, paraffin-embedded tissue sections. Appl Immunohistochem Mol Morphol 2005;13(1):91-5.
9. Treaba DO, Hing AW, Tse CC, Goldstein LC, Barry TS, Kandalaft P, et al. Significantly improved sensitivity for ER detection in breast cancer using a new rabbit monoclonal anti-ER antibody (SP1). Mod Pathol 2005;18:53A.
10. Shek LL, Godolphin W. Model for breast cancer survival: relative prognostic roles of axillary nodal status, TNM stage, estrogen receptor concentration, and tumor necrosis. Cancer Res 1988;48(19):5565-9.
11. Shek LL, Godolphin W. Survival with breast cancer: the importance of estrogen receptor quantity. Eur J Cancer Clin Oncol 1989;25(2):243-50.
12. Shek LL, Godolphin W, Spinelli JJ. Oestrogen receptors, nodes and stage as predictors of post-recurrence survival in 457 breast cancer patients. Br J Cancer 1987;56(6):825-9.
13. Lohrisch C, Jackson J, Jones A, Mates D, Olivotto IA. Relationship between tumor location and relapse in 6,781 women with early invasive breast cancer. J Clin Oncol 2000;18(15):2828-35.
14. Chia SK, Speers CH, Bryce CJ, Hayes MM, Olivotto IA. Ten-year outcomes in a population-based cohort of node-negative, lymphatic, and vascular invasion-negative early breast cancers without adjuvant systemic therapies. J Clin Oncol 2004;22(9):1630-7.
15. Olivotto IA, Coldman AJ, Hislop TG, Trevisan CH, Kula J, Goel V, et al. Compliance with practice guidelines for node-negative breast cancer. J Clin Oncol 1997;15(1):216-22.
16. Olivotto IA, Bajdik CD, Ravdin PM, Speers CH, Coldman AJ, Norris BD, et al. Population-based validation of the prognostic model ADJUVANT! for early breast cancer. J Clin Oncol 2005;23(12):2716-25.
17. Kononen J, Bubendorf L, Kallioniemi A, Barlund M, Schraml P, Leighton S, et al. Tissue microarrays for high-throughput molecular profiling of tumor specimens. Nat Med 1998;4(7):844-7.
18. Parker RL, Huntsman DG, Lesack DW, Cupples JB, Grant DR, Akbari M, et al. Assessment of interlaboratory variation in the immunohistochemical determination of estrogen receptor status using a breast cancer tissue microarray. Am J Clin Pathol 2002;117(5):723-8.
19. Ng TL, Gown AM, Barry TS, Cheang MC, Chan AK, Turbin DA, et al. Nuclear beta-catenin in mesenchymal tumors. Mod Pathol 2005;18(1):68-74.
20. Fisher ER, Anderson S, Dean S, Dabbs D, Fisher B, Siderits R, et al. Solving the dilemma of the immunohistochemical and other methods used for scoring estrogen receptor and progesterone receptor in patients with invasive breast carcinoma. Cancer 2005;103(1):164-73.
21. Kaplan E, Meier P. Nonparametric estimation from incomplete observations. J Am Stat Assoc 1958;53:457-481.
22. Cox DR. Regression models and life tables. Journal of the Royal Statistical Society B 1972;34:187-220.
23. Mantel N. Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemotherapy Reports 1966;50:163-170.
24. Cox DR. Partial likelihood. Biometrika 1975;62:269-276.
25. Grambsch P, Therneau T. Proportional hazards tests and diagnostics based on weighted residuals. Biometrika 1994;81(3):515-26.
26. Cohen J. A coefficient of agreement for nominal scales. Educational and Psychological Measurement 1960;20:27-46.
27. Kendall M. Rank Correlation Methods. 4th ed. London: Griffin; 1970.
28. Chernoff H, Lehmann EL. The use of maximum likelihood estimates in χ² tests for goodness-of-fit. The Annals of Mathematical Statistics 1954;25:576-586.
29. Mann HB, Whitney DR. On a test of whether one of 2 random variables is stochastically larger than the other. Annals of Mathematical Statistics 1947;18:50-60.
30. Hess KR, Pusztai L, Buzdar AU, Hortobagyi GN. Estrogen receptors and distinct patterns of breast cancer relapse. Breast Cancer Res Treat 2003;78(1):105-18.
31. Early Breast Cancer Trialists' Collaborative Group. Effects of chemotherapy and hormonal therapy for early breast cancer on recurrence and 15-year survival: an overview of the randomised trials. Lancet 2005;365(9472):1687-717.
32. Harvey JM, Clark GM, Osborne CK, Allred DC. Estrogen receptor status by immunohistochemistry is superior to the ligand-binding assay for predicting response to adjuvant endocrine therapy in breast cancer. J Clin Oncol 1999;17(5):1474-81.
33. Kaplan PA, Frazier SR, Loy TS, Diaz-Arias AA, Bradley K, Bickel JT. 1D5 and 6F11: An immunohistochemical comparison of two monoclonal antibodies for the evaluation of estrogen receptor status in primary breast carcinoma. Am J Clin Pathol 2005;123(2):276-80.
34. Bevitt DJ, Milton ID, Piggot N, Henry L, Carter MJ, Toms GL, et al. New monoclonal antibodies to oestrogen and progesterone receptors effective for paraffin section immunohistochemistry. J Pathol 1997;183(2):228-32.
35. McCabe A, Dolled-Filhart M, Camp RL, Rimm DL. Automated Quantitative Analysis (AQUA) of In Situ Protein Expression, Antibody Concentration, and Prognosis. Journal of the National Cancer Institute 2005;97(24):1808-15.
36. Camp RL, Chung GG, Rimm DL. Automated subcellular localization and quantification of protein expression in tissue microarrays. Nat Med 2002;8(11):1323-7.
37. Berry DA, Cirrincione C, Henderson IC, Citron ML, Budman DR, Goldstein LJ, et al. Estrogen-receptor status and outcomes of modern chemotherapy for patients with node-positive breast cancer. JAMA 2006;295(14):1658-67.

CHAPTER FOUR

GATA-3 EXPRESSION IN BREAST CANCER HAS A STRONG ASSOCIATION WITH ESTROGEN RECEPTOR BUT LACKS INDEPENDENT PROGNOSTIC VALUE 3

In gene expression profiling data, expression of GATA3 mRNA, which encodes a nuclear transcription factor, is commonly seen in ER+/Luminal A tumors. We therefore hypothesized that GATA3 was a potential Luminal A biomarker that might sub-classify the ER positive breast tumors into good and bad prognosis groups.
We first tested antibodies recognizing GATA3 on clinically annotated tissue microarrays constructed using an independent cohort of 438 invasive breast tumors. Next, using the tissue microarrays representing over 4000 invasive breast tumors, we aimed to validate the prognostic value of GATA3 as a univariate biomarker in the general population, as well as among ER positive tumors.

Reprinted with the permission of Cancer Epidemiology, Biomarkers & Prevention, American Association for Cancer Research.

3 A version of this chapter was published. Voduc D, Cheang MCU, Nielsen TO. GATA-3 expression in breast cancer has a strong association with estrogen receptor but lacks independent prognostic value. Cancer Epidemiology Biomarkers and Prevention 2008;17(2):365-73.

INTRODUCTION

Estrogen receptor (ER) status in breast cancer is used to estimate prognosis and to guide systemic treatment. ER positive status is generally associated with a more favorable prognosis (1, 2), but more importantly it is a predictive marker of response to hormonal therapies, such as tamoxifen and aromatase inhibitors (3, 4). However, ER positive tumors represent a large and heterogeneous subgroup, and treatment decisions are most often based on clinicopathological features such as tumor grade and lymph node status. Novel biomarkers can be used to further refine prognostic models, and some may be useful to predict response to adjuvant treatment.

Gene expression profiling studies have shown a consistent separation of ER positive tumors into Luminal A and Luminal B subtypes, with Luminal B tumors having significantly worse outcome (5-7). In addition, from studies of tamoxifen in advanced breast cancer, it is estimated that 30-50% of ER positive tumors are not tamoxifen-responsive (8, 9).
It would be of great clinical significance if Luminal subtype and response to tamoxifen could be determined with inexpensive and readily available biomarker laboratory tests, such as immunohistochemistry.

The proteins of the GATA family are zinc-finger transcription factors involved in embryogenesis and development. GATA-3, in particular, plays a role in placental development, hematopoiesis and adipogenesis (10-12). In breast epithelial cells, it has been proposed that GATA-3 acts to maintain a differentiated state. Aberrant expression of GATA-3 has been reported in breast cancer, as well as in pancreatic and cervical cancers (12-14).

Studies using both gene expression profiling and immunohistochemistry have demonstrated that GATA-3 expression in breast cancer is closely associated with the estrogen receptor (6, 15, 16). In another study, GATA-3 gene constructs were transfected into breast cancer cell lines, and it was found that many GATA-3 induced genes were also in the luminal gene cluster (17). In a recent tissue microarray-based study (Mehra et al (16)), GATA-3 protein was found to have independent prognostic value in a cohort of 139 breast cancer patients. More specifically, low expression of GATA-3 predicted for breast cancer relapse. The authors also concluded that low GATA-3 levels could identify a subgroup of ER positive tumors with a greater risk of relapse. This work suggests that GATA-3 may be useful in distinguishing Luminal A and Luminal B subtype tumors. Herein we attempt to confirm these findings with the immunohistochemical analysis of GATA-3 in a much larger cohort of breast cancer patients, with adjuvant treatment data and long follow-up.

MATERIALS AND METHODS

Study Population

Our original study cohort consisted of 4444 breast cancer patients diagnosed between January 1986 and September 1992. This represents 34% of all patients diagnosed with breast cancer in the province of British Columbia during this time period.
This large, well characterized cohort was derived from a consecutive series of patients who were referred to the BC Cancer Agency for consultation and had tumor samples sent to a central laboratory at the Vancouver General Hospital for estrogen receptor testing. For all of these patients, we have available detailed demographic and outcome data, as well as formalin fixed paraffin embedded (FFPE) primary tumor samples for immunohistochemical analysis. Patients with in-situ disease, recurrent disease, metastatic disease at presentation, and male breast cancer were excluded from analysis. Available clinical information includes age, histology, tumor grade, tumor size, lymph node status, type of local and adjuvant systemic therapy, and dates of first recurrence and death. A portion of this cohort of patients was recently used in a population study validating the on-line breast cancer prognostic calculator ADJUVANT! Online (18). This study was approved by the Clinical Research Ethics Board of the University of British Columbia and the BC Cancer Agency.

Tissue Microarrays and Immunohistochemistry

The Vancouver Hospital Estrogen Receptor laboratory retained single archival tumor blocks from each case in this patient cohort. The material had been frozen prior to neutral buffered formalin fixation. Hematoxylin and eosin stained slides from these blocks were reviewed by two pathologists to identify areas of invasive breast cancer. Cores of 0.6 mm were extracted from the tumor blocks and used to construct a tissue microarray (TMA) as previously described (19, 20).

Using a single core per case, 17 TMA blocks were required to represent the series. Four-micrometer sections were stained for GATA-3 using the Ventana Systems Discovery XT automated immunostainer. Slides were deparaffinized and incubated with EDTA buffer (pH 8.0) for antigen retrieval.
The slides were incubated with anti-GATA-3 antibodies for 32 minutes (Santa Cruz Biotechnology Inc., HG3-31 mouse monoclonal antibody, 1:20), and then with secondary antibody (Ventana universal secondary antibody) for an additional 32 minutes. Slides were counterstained with hematoxylin, rinsed with soap solution, and dehydrated through graded ethanol. The slides were then cleared in xylene and coverslipped with Cytoseal XYL (Richard-Allan Scientific). Stained TMA slides were digitally scanned and linked to a relational database, and the primary image data is available for public review (www.gpecimage.ubc.ca; to access prior to publication of this manuscript please use the username “gata3” and the password “gata3”).

Scoring of GATA-3 immunostaining was semiquantitative, using digital images (Figure 4.1). The scoring system used by Mehra et al was based on a combination of percent positive nuclei and staining intensity, but cases were subsequently binarized into low and high GATA-3 expression. For ease of reproducibility, we limited interpretation to the percentage of positive nuclei and applied the scoring system published by van de Rijn et al (21). Staining was considered negative (0) if <5% of nuclei were stained above background, moderate (1+) if 5-20% of nuclei were stained, and strong (2+) if >20% were stained. For statistical analysis, we dichotomized GATA-3 staining into negative (0) and positive (1+ or 2+). We also excluded cases for which it was not possible to assign a score to the immunostaining (insufficient invasive tumor in the core, tissue core cut through, or tissue disc lost or folded during sectioning).

Statistical Analysis

Statistical analysis was performed using SPSS 14.0. In univariate analysis, relapse free survival (RFS), breast cancer specific survival (BCSS), distant relapse free survival (DRFS), and overall survival (OS) were estimated using Kaplan-Meier curves.
Significant differences in survival were assessed with Breslow tests. For OS, any death was considered an event, and patients were censored at the time of last follow-up. For RFS, an event was defined as any breast cancer relapse (locoregional or distant), and patients were censored at the date of death or the date of last follow-up. For DRFS, only distant relapses were considered events. For BCSS, an event was defined as a breast cancer death, and patients were censored at the time of a non-breast cancer death or at the date of last follow-up. For all outcomes, survival time was calculated from the date of surgery to the date of an event or the date that the patient was censored. Six patients with unknown cause of death were excluded from RFS, BCSS, and DRFS analysis (but they were included in OS analysis). Cox proportional hazards models were used to calculate adjusted hazard ratios accounting for covariates. Pearson chi-square and Mann-Whitney tests were used to measure the association of GATA-3 status with common pathological variables. All statistical tests were two-sided. The intent of this study was to validate pre-existing hypotheses on GATA-3 using a much larger cohort of patients; in addition, we performed relatively few statistical tests on a large number of patients with long-term follow-up. Consequently, we did not apply corrections for multiple comparisons. The threshold for statistical significance in this study was p = 0.05.

Missing GATA-3 Data

There was a relatively large proportion of cases with missing GATA-3 data, attributable in part to the practical necessity of using a single core per case in this very large TMA series. Variations in the thickness of the source blocks led to cut-through and loss of some individual cores, and some were also lost during staining and sectioning. To ensure accurate scoring, we did not assign a GATA-3 score for cores containing fewer than 50 definite invasive cancer cells.
We did perform GATA-3 staining on a smaller tissue microarray (N = 413) with duplicate cores, and found 90% concordance between duplicate cores.

With the exclusions stated previously (in-situ disease, male breast cancer, and patients presenting with recurrent or metastatic disease), 4049 patients remained in our patient cohort. Of these, 930 (23%) TMA cores were considered uninterpretable for GATA-3. The size of the cohort with available GATA-3 data was 3119.

Missing GATA-3 data points were not significantly associated with age, lymph node status, ER status, or Her2 status. There was a statistically significant association between missing GATA-3 data and Grade 1/2 tumors and tumors greater than 5 cm; however, the absolute differences were small. In the whole patient cohort, the proportion of cases with missing GATA-3 data was 23%. In Grade 1/2 tumors the proportion was 25%, versus 20% in Grade 3 tumors (Pearson Chi-square test p = 3.0E-04). The proportion of missing GATA-3 data was 32% in tumors greater than 5 cm, 20% in tumors 2 to 5 cm, and 24% in tumors less than 2 cm (Pearson Chi-square test p = 3.7E-05). In consideration of these results, it is reasonable to conclude that there is a bias in our methodology leading to very large tumors being under-represented. It is possible that areas of necrosis observed in large tumors lead to sampling error during core extraction and TMA construction. However, it should be noted that tumors greater than 5 cm represent only 5.5% of our whole study population. Although these large tumors are associated with an inferior prognosis, we found no difference in survival between patients with missing GATA-3 data and those with GATA-3 data (BCSS @ 10 years, 74% vs. 74%, Breslow test p = 0.57).
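The Kaplan-Meier estimates and censoring rules described under Statistical Analysis can be illustrated with a minimal sketch. The analysis in this chapter was run in SPSS 14.0; the Python below is only an illustration of the product-limit estimator, and the follow-up times and event flags are hypothetical values, not cohort data. For BCSS, an event flag of 1 stands for a breast cancer death and 0 stands for censoring at a non-breast cancer death or at last follow-up.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Product-limit survival estimate.

    durations: follow-up time for each patient (e.g. years from surgery)
    events:    1 if the event of interest occurred at that time, 0 if censored
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # everyone leaving the risk set at each time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Patients censored at time t still count as at risk at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy data (years of follow-up, event flag); not from the cohort.
durations = [2.0, 3.5, 3.5, 5.0, 8.0, 10.0, 12.0, 12.0]
events = [1, 1, 0, 1, 0, 1, 0, 0]

curve = kaplan_meier(durations, events)
for time, prob in curve:
    print(f"t = {time:4.1f} years  S(t) = {prob:.3f}")
```

In this sketch the estimate only steps down at event times; censored patients leave the risk set without changing the curve, which is how a non-breast cancer death is handled in the BCSS definition above.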
RESULTS

Patient Demographics and Pathological Data

In our cohort of 3119 patients, the mean age at diagnosis was 59 years and the median follow-up for breast cancer specific survival was 12.6 years (follow-up is defined as the time from diagnosis to the time of event or last follow-up). The range for follow-up was 1 month to 18.5 years. The median tumor size was 2.5 cm. 53% of patients had Grade 3 tumors, 44% were node positive, and 78% were ER positive (Table 4.1). 42% had breast conserving surgery and 58% had mastectomy; adjuvant radiation therapy was given to 55%. 58% received adjuvant systemic therapy (either chemotherapy or hormonal therapy). Outcome was last updated June 30, 2004, and to that date there were a total of 914 (29%) breast cancer deaths, 1202 (39%) relapses, 1043 (33%) distant relapses, and 1556 (50%) total deaths.

GATA-3 Immunostaining

Of 3119 breast cancer tumors examined with interpretable GATA-3 staining, 2140 (68%) cases were GATA-3 negative, 646 (21%) exhibited moderate staining, and 333 (11%) were strongly positive. Once GATA-3 data was dichotomized, 68% of cases were GATA-3 negative and 32% GATA-3 positive.

Associations with Known Pathological Factors

In our patient cohort, the common clinicopathological variables, including patient age at diagnosis, tumor size, histologic grade, nodal status, estrogen receptor status, and Her2 status, were all statistically significant predictors of breast cancer specific survival (Table 4.2a), relapse free survival (Table 4.2b), distant relapse free survival (Table 4.2c), and overall survival (Table 4.2d) on univariate analysis.

There was a strong association between GATA-3 protein expression and positive estrogen receptor status (Pearson Chi-square test p = 2.1E-67). 78% of this TMA consists of ER positive cases, and among the ER positive cases 39% were also GATA-3 positive. In contrast, among the ER negative cases, only 2.6% were GATA-3 positive.
Overall, 98% of GATA-3 positive cases were also ER positive.

There was a linear association between age at diagnosis and GATA-3 expression (Mann-Whitney test p = 3.6E-05), with GATA-3 present in a greater proportion of older patients. However, the actual difference was small; the mean age in GATA-3 positive cases was 60 years versus 58 years in GATA-3 negative cases. There was also a linear association between GATA-3 expression and smaller tumor size (Mann-Whitney test p = 0.002). Again, the actual difference was small. The mean tumor size in GATA-3 positive cases was 2.3 cm versus 2.6 cm in GATA-3 negative cases.

There was a strong association between GATA-3 positive cases and Grade 1/2 tumors (Pearson Chi-square test p = 1.1E-16), and absence of HER2 overexpression (Pearson Chi-square test p = 9.7E-13). We did not find a significant association between GATA-3 and lymph node status (Pearson Chi-square test p = 0.47).

Univariable Survival Analysis

In this study, we performed survival analysis for multiple outcomes; however, the results from GATA-3 survival analysis were consistent throughout all outcomes analyzed. In this patient cohort, the presence of GATA-3 was a statistically significant marker of good prognosis for all outcomes including BCSS, RFS, DRFS, and OS (Figure 4.2). The BCSS at 10 years was 78% for GATA-3 positive cases vs. 72% for negative cases (Breslow test p = 9.6E-05). The difference in 10-year RFS, DRFS, and OS was 3%, 4%, and 3% respectively. But within the ER positive subgroup, GATA-3 did not have statistically significant prognostic value for any of these outcomes (Figure 4.3). The BCSS at 10 years was 78% vs. 76% (p = 0.26). Our results demonstrate a difference in BCSS of +1.9% for GATA-3 positive cases, with the 95% confidence interval between +5.5% and -1.7%. The difference in 10-year RFS, DRFS, and OS in the ER positive subgroup was 1%, 2%, and 1% respectively.
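As a worked check of the interval quoted above for the ER positive subgroup, the following sketch recovers the standard error implied by the reported 95% confidence interval. It assumes a symmetric normal-approximation interval (difference ± 1.96 × SE); the text does not state how the interval was computed, so that form is an assumption made only for illustration.

```python
Z_95 = 1.96  # two-sided 95% normal quantile (assumed interval construction)


def implied_standard_error(lower, upper, z=Z_95):
    """Standard error implied by a symmetric interval [lower, upper]."""
    return (upper - lower) / (2.0 * z)


# Values reported in the text, in percentage points of 10-year BCSS.
diff_pct = 1.9
ci_lower_pct, ci_upper_pct = -1.7, 5.5

se_pct = implied_standard_error(ci_lower_pct, ci_upper_pct)
midpoint_pct = (ci_lower_pct + ci_upper_pct) / 2.0
includes_zero = ci_lower_pct <= 0.0 <= ci_upper_pct

print(f"implied SE of the difference: {se_pct:.2f} percentage points")
print(f"interval midpoint: {midpoint_pct:.1f} (reported difference: {diff_pct})")
print(f"interval includes zero: {includes_zero}")
```

Under this assumption the interval is centered on the reported difference and spans zero, which agrees with the non-significant Breslow test (p = 0.26) in the ER positive subgroup.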
We also analyzed the prognostic significance of GATA-3 in the patient subgroup that was ER positive and did not receive adjuvant tamoxifen, and in the ER positive subgroup that was treated with tamoxifen. In the no adjuvant tamoxifen subgroup, GATA-3 was not prognostic for any outcome (e.g. BCSS @ 10 years 85% vs. 85%, p = 0.55). Similarly, in the subgroup treated with adjuvant tamoxifen, GATA-3 was again not prognostic for any outcome (e.g. BCSS @ 10 years 76% vs. 71%, p = 0.10).

Multivariable Survival Analysis

Using a Cox proportional hazards model including all patients, tumor size, histologic grade, Her2 status, and nodal status were all independent predictors of BCSS (Table 4.3a), RFS (Table 4.3b), DRFS (Table 4.3c), and OS (Table 4.3d). In comparison to baseline, certain age groups were prognostic for BCSS, RFS, and OS. ER status was independently prognostic only for BCSS. GATA-3 was not independently prognostic for any of the outcomes measured. We repeated the Cox proportional hazards model for the ER positive subgroup, and for the ER positive subgroups treated with and without adjuvant tamoxifen. The results were similar to those of the whole cohort, with GATA-3 not independently prognostic in any subgroup, for any outcome.

DISCUSSION

GATA-3 is a highly conserved protein that plays a critical role in development and cellular differentiation (10, 12). Usary et al performed mutational analysis of GATA-3 in human breast tumors and proposed that GATA-3 is involved in luminal differentiation. It was also suggested that high expression of GATA-3 in breast luminal cells is “normal,” and a loss of GATA-3 expression may contribute to tumorigenesis.

The association between GATA-3 and estrogen receptor was previously recognized using hierarchical clustering analysis of gene expression data from 34 primary breast carcinomas (22) and subsequently confirmed in a separate study of 78 breast cancer tumors (6).
This study from Sorlie et al found an ER gene cluster including estrogen receptor alpha and GATA-3; also in this gene cluster were X-box binding protein 1, trefoil factor 3, hepatocyte nuclear factor 3 alpha, and LIV-1. Expression levels of this gene cluster separated the Luminal/ER positive tumors into subgroups. The good prognosis Luminal A group exhibited the highest expression of the ER gene cluster, while the Luminal B/C group had low/moderate expression and was associated with a worse prognosis.

Mehra et al. performed a meta-analysis of four breast cancer microarray expression profile data sets (including the previously mentioned data set from Sorlie et al.) totaling 305 breast tumor samples (16). GATA-3 mRNA levels were prognostic for relapse free survival, with higher levels associated with favorable outcome. GATA-3 mRNA levels were lower in ER-negative and high grade tumors.

In addition to the meta-analysis, these researchers constructed a tissue microarray from 139 consecutive single institution breast cancer patients, and GATA-3 levels were assessed with immunohistochemistry. Their GATA-3 score was based on both intensity of staining and percentage of nuclei stained, but a binarized score was used for analysis. Mehra et al found that low levels of GATA-3 were associated with larger tumor size, positive lymph node status, higher grade, ER negative status, and Her2 positive status. Overall, high GATA-3 expression was a marker of good prognosis for breast cancer relapse free survival and overall survival; and in multivariate analysis, GATA-3 independently predicted for superior relapse free survival.

Using an easily reproducible system for scoring GATA-3 immunostaining, our study of GATA-3 protein expression in a much larger cohort of breast cancer tumors confirms some of these previously published conclusions.
We found that GATA-3 is almost exclusively expressed in association with ER (p = 2.1E-67), with 98% of GATA-3 positive cases also being ER positive. In our study cohort, among ER positive cases 39% are GATA-3 positive. This number is similar to the results of Mehra et al., who found that 46% of ER positive cases were GATA-3 positive. In addition, we also confirm that GATA-3 is associated with favorable prognostic features including older age at diagnosis, lower histologic grade, and Her2 negative status. However, in our study GATA-3 levels were not significantly associated with lymph node status, and this is discordant with the results of Mehra et al.

In univariate analysis of the entire cohort, we found that the presence of GATA-3 is a relatively weak marker of good prognosis for all outcomes including OS and RFS (the outcomes presented by Mehra et al.). This finding is consistent with the close association between GATA-3 and ER (ER is a stronger marker of good prognosis). However, Mehra et al. reported that high GATA-3 protein expression predicted for superior RFS in univariate analysis of the ER positive subgroup and in multivariate analysis of their whole cohort. In our study GATA-3 was not significantly prognostic in the ER positive subgroup for any of the survival outcomes reported, nor did it have independent prognostic significance in our multivariate models. Because of its lack of prognostic significance within the ER positive subgroup, immunohistochemical assay of GATA-3 is unlikely to be useful in distinguishing the Luminal A from Luminal B biological subtypes. In the ER positive patients, our results demonstrate a difference in 10-year breast cancer specific survival of 2% between GATA-3 positive and GATA-3 negative cases, and the 95% confidence interval excludes an absolute difference of greater than 5.5%. Thus, GATA-3 is not a clinically useful prognostic marker in breast cancer given that ER status is already routinely obtained.
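The co-expression figures cited earlier in this discussion can be cross-checked with simple arithmetic. The sketch below combines the ER status counts listed in Table 4.1 (2421 ER positive, 618 ER negative) with the reported GATA-3 positive proportions in each group (39% and 2.6%). Because those percentages are rounded and 80 cases have unknown ER status, the reconstructed counts are approximations rather than the actual cell counts.

```python
# Counts from Table 4.1 and proportions from the Results text.
er_positive_cases = 2421
er_negative_cases = 618
gata3_pos_rate_in_er_pos = 0.39    # reported as 39%
gata3_pos_rate_in_er_neg = 0.026   # reported as 2.6%

# Approximate numbers of GATA-3 positive cases in each ER group.
gata3_pos_er_pos = gata3_pos_rate_in_er_pos * er_positive_cases
gata3_pos_er_neg = gata3_pos_rate_in_er_neg * er_negative_cases

# Share of GATA-3 positive cases that are ER positive.
share_er_pos = gata3_pos_er_pos / (gata3_pos_er_pos + gata3_pos_er_neg)

print(f"GATA-3+ among ER+: about {gata3_pos_er_pos:.0f} cases")
print(f"GATA-3+ among ER-: about {gata3_pos_er_neg:.0f} cases")
print(f"share of GATA-3+ cases that are ER+: {share_er_pos:.1%}")
```

The reconstructed share rounds to 98%, in line with the figure stated in the text.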
It has also been hypothesized that GATA-3 could be a predictive biomarker for tamoxifen responsiveness in ER positive tumors (23). Parikh et al performed GATA-3 immunohistochemistry on 14 tumor samples from patients determined to be hormone unresponsive (based on clinical progression or early recurrence on hormonal therapy). They found that the hormone unresponsive patients were more likely to have low expression of GATA-3 compared to the 14 hormone responsive controls. Their conclusion that GATA-3 is predictive of tamoxifen response is not supported by our study. We performed Kaplan-Meier analyses of GATA-3 expression among ER positive patients receiving adjuvant tamoxifen, and in ER positive patients not receiving any adjuvant systemic therapy. Although evidence of predictive effect in a biomarker is best demonstrated in the context of a randomized clinical trial, we can make some useful observations in our cohort. If GATA-3 status were predictive of tamoxifen response, then we would expect to see GATA-3 having a reasonably strong prognostic effect in the ER positive group treated with tamoxifen. Similarly, this prognostic effect would be reduced, or absent, in the ER positive group that did not receive tamoxifen. However, in both the tamoxifen treated and untreated subgroups, GATA-3 did not have prognostic value for any of the outcomes presented in this study.

Consequently, there is no evidence from our study that GATA-3 is a predictive biomarker for response to hormonal therapies. Consistent with our finding is the exclusion of GATA-3 from the final 21 gene panel used to calculate the Recurrence Score, a multigene molecular test that is independently prognostic in a large series of node negative, ER positive, and tamoxifen treated breast cancer patients (24).
CONCLUSION

Using a large, well-annotated tissue microarray series, a clinically practical immunohistochemical assessment, and strict statistical analysis, we find that GATA-3 is a breast cancer marker almost exclusively expressed among estrogen receptor positive tumors. Similar to the estrogen receptor, it is associated with favorable prognostic features and is a univariate marker of good prognosis across multiple survival outcomes, including relapse, breast cancer death, and overall survival. GATA-3 does not have independent prognostic significance in multivariate Cox models incorporating the standard clinicopathological variables. It is not prognostic, for any outcome, within the ER positive subgroup and does not appear to predict for tamoxifen response in ER positive patients.

Table 4.1. Summary of Clinicopathological Variables

Clinicopathological Variable | Category | Total | Percent (%)
Total | | 3119 |
Age (years) | < 40 | 234 | 7
Age (years) | 40 – 49 | 662 | 21
Age (years) | 50 – 65 | 1112 | 36
Age (years) | > 65 | 1111 | 36
Size (cm) | < 2 | 1586 | 51
Size (cm) | 2 – 5 | 1359 | 44
Size (cm) | > 5 | 149 | 5
Size (cm) | Unknown | 25 | <1
Grade | 1 or 2 | 1341 | 43
Grade | 3 | 1649 | 53
Grade | Unknown | 129 | 4
Nodal Status | Negative | 1741 | 56
Nodal Status | Positive | 1369 | 44
Nodal Status | Unknown | 9 | <1
ER Status | Negative | 618 | 20
ER Status | Positive | 2421 | 78
ER Status | Unknown | 80 | 3
Her2 Status | Negative | 2627 | 84
Her2 Status | Positive | 413 | 13
Her2 Status | Unknown | 79 | 3

Table 4.2. Univariable Hazard Ratios for Standard Clinicopathological Variables. Comparison of univariate hazard ratios using (a) breast cancer specific survival (BCSS), (b) relapse free survival (RFS), (c) distant relapse free survival (DRFS), and (d) overall survival (OS) for the standard clinicopathological variables in this study.

(a)

Variable | Comparison | Difference in BCSS @ 10 years | Breslow Test p Value | Univariate Hazard Ratio
Age (Yrs) | 40 – 49 vs. <40 | 75% vs. 62% | | 0.66
Age (Yrs) | 50 – 65 vs. <40 | 75% vs. 62% | | 0.68
Age (Yrs) | >65 vs. <40 | 75% vs. 62% | 5.6E-04 | 0.69
Size (cm) | 2-5 vs. <2 | 66% vs. 83% | | 1.94
Size (cm) | >5 vs. <2 | 52% vs. 83% | 8.2E-34 | 3.19
Grade | Gr. 3 vs. Gr. 1/2 | 67% vs. 83% | 5.4E-27 | 1.98
Nodal Status | Pos. vs. Neg | 61% vs. 84% | 3.7E-51 | 2.69
ER Status | Pos. vs. Neg | 77% vs. 63% | 2.3E-17 | 0.63
Her2 Status | Pos. vs. Neg | 60% vs. 76% | 1.0E-15 | 1.76

(b)

Variable | Comparison | Difference in RFS @ 10 years | Breslow Test p Value | Univariate Hazard Ratio
Age (Yrs) | 40 – 49 vs. <40 | 62% vs. 53% | | 0.72
Age (Yrs) | 50 – 65 vs. <40 | 64% vs. 53% | | 0.68
Age (Yrs) | >65 vs. <40 | 64% vs. 53% | 3.4E-05 | 0.65
Size (cm) | 2-5 vs. <2 | 57% vs. 70% | | 1.59
Size (cm) | >5 vs. <2 | 42% vs. 70% | 6.1E-25 | 2.53
Grade | Gr. 3 vs. Gr. 1/2 | 56% vs. 71% | 3.1E-23 | 1.70
Nodal Status | Pos. vs. Neg | 51% vs. 73% | 5.8E-42 | 2.19
ER Status | Pos. vs. Neg | 65% vs. 56% | 2.2E-12 | 0.73
Her2 Status | Pos. vs. Neg | 51% vs. 65% | 6.5E-13 | 1.56

(c)

Variable | Comparison | Difference in DRFS @ 10 years | Breslow Test p Value | Univariate Hazard Ratio
Age (Yrs) | 40 – 49 vs. <40 | 69% vs. 57% | | 0.67
Age (Yrs) | 50 – 65 vs. <40 | 68% vs. 57% | | 0.68
Age (Yrs) | >65 vs. <40 | 69% vs. 57% | 3.9E-04 | 0.66
Size (cm) | 2-5 vs. <2 | 60% vs. 77% | | 1.90
Size (cm) | >5 vs. <2 | 46% vs. 77% | 2.6E-33 | 2.92
Grade | Gr. 3 vs. Gr. 1/2 | 61% vs. 77% | 5.8E-26 | 1.86
Nodal Status | Pos. vs. Neg | 53% vs. 79% | 5.8E-58 | 2.71
ER Status | Pos. vs. Neg | 70% vs. 60% | 3.9E-12 | 0.70
Her2 Status | Pos. vs. Neg | 55% vs. 70% | 2.0E-24 | 1.66

(d)

Variable | Comparison | Difference in OS @ 10 years | Breslow Test p Value | Univariate Hazard Ratio
Age (Yrs) | 40 – 49 vs. <40 | 73% vs. 61% | | 0.68
Age (Yrs) | 50 – 65 vs. <40 | 68% vs. 61% | | 0.94
Age (Yrs) | >65 vs. <40 | 51% vs. 61% | 4.9E-29 | 1.63
Size (cm) | 2-5 vs. <2 | 56% vs. 71% | | 1.60
Size (cm) | >5 vs. <2 | 46% vs. 71% | 4.7E-26 | 1.98
Grade | Gr. 3 vs. Gr. 1/2 | 56% vs. 71% | 1.1E-19 | 1.48
Nodal Status | Pos. vs. Neg | 52% vs. 71% | 1.3E-38 | 1.90
ER Status | Pos. vs. Neg | 64% vs. 56% | 1.2E-7 | 0.84
Her2 Status | Pos. vs. Neg | 51% vs. 64% | 1.7E-9 | 1.39

Table 4.3. Cox PHD Model of GATA-3.
Cox proportional hazards regression analysis for (a) breast cancer specific survival (BCSS), (b) relapse free survival (RFS), (c) distant relapse free survival (DRFS), and (d) overall survival (OS). Entries are multivariate hazard ratios with p values in parentheses.

(a) BCSS

| Variable | Comparison | Whole Cohort | ER Positive Only | ER Positive + Tamoxifen | ER Positive - Tamoxifen |
|---|---|---|---|---|---|
| Sample Size | | 3114 | 2416 | 917 | 954 |
| Age at Diagnosis (Yrs) | 40-49 vs. <40 | 0.76 (p = 0.035) | 0.72 (p = 0.057) | NA* | 1.65 (p = 0.41) |
| | 50-65 vs. <40 | 0.84 (p = 0.16) | 0.82 (p = 0.23) | NA | 2.04 (p = 0.23) |
| | >65 vs. <40 | 0.93 (p = 0.54) | 0.88 (p = 0.44) | NA | 1.75 (p = 0.35) |
| Tumor Size (cm) | 2 < size ≤ 5 vs. ≤ 2 | 1.60 (p = 5.7E-10) | 1.65 (p = 1.4E-8) | 1.57 (p = 7.1E-4) | 1.65 (p = 3.5E-5) |
| | > 5 vs. ≤ 2 | 2.25 (p = 2.9E-8) | 3.13 (p = 2.1E-12) | 4.58 (p = 9.8E-9) | 0.75 (p = 0.78) |
| Grade | Grade 3 vs. Grade 1/2 | 1.54 (p = 2.0E-9) | 1.50 (p = 1.8E-6) | 1.52 (p = 1.3E-3) | 1.44 (p = 0.025) |
| Nodal Status | Positive vs. Negative | 2.35 (p = 5.6E-31) | 2.35 (p = 1.4E-22) | 1.57 (p = 1.1E-6) | 5.15 (p = 7.0E-10) |
| ER Status | Positive vs. Negative | 0.79 (p = 0.011) | NA | NA | NA |
| Her2 Status | Positive vs. Negative | 1.36 (p = 0.0010) | 1.59 (p = 1.7E-4) | 1.76 (p = 0.0016) | 1.80 (p = 0.023) |
| GATA-3 Status | Positive vs. Negative | 1.01 (p = 0.90) | 1.03 (p = 0.69) | 0.90 (p = 0.44) | 1.29 (p = 0.12) |

* Hazard Ratio not available for age groups in the ER Positive + Tamoxifen subgroup because there were no patients diagnosed at age <40 years in this subgroup. This note applies to panels (a) through (d).

(b) RFS

| Variable | Comparison | Whole Cohort | ER Positive Only | ER Positive + Tamoxifen | ER Positive - Tamoxifen |
|---|---|---|---|---|---|
| Sample Size | | 3114 | 2416 | 917 | 954 |
| Age at Diagnosis (Yrs) | 40-49 vs. <40 | 0.81 (p = 0.075) | 0.75 (p = 0.055) | NA* | 1.08 (p = 0.84) |
| | 50-65 vs. <40 | 0.80 (p = 0.045) | 0.73 (p = 0.029) | NA | 1.08 (p = 0.83) |
| | >65 vs. <40 | 0.82 (p = 0.082) | 0.70 (p = 0.016) | NA | 0.89 (p = 0.74) |
| Tumor Size (cm) | 2 < size ≤ 5 vs. ≤ 2 | 1.37 (p = 1.1E-6) | 1.40 (p = 7.1E-6) | 1.49 (p = 8.9E-4) | 1.58 (p = 5.3E-4) |
| | > 5 vs. ≤ 2 | 1.91 (p = 2.3E-7) | 2.34 (p = 1.5E-8) | 4.30 (p = 3.8E-9) | 1.27 (p = 0.68) |
| Grade | Grade 3 vs. Grade 1/2 | 1.41 (p = 3.4E-7) | 1.38 (p = 1.0E-5) | 1.49 (p = 4.8E-4) | 1.19 (p = 0.17) |
| Nodal Status | Positive vs. Negative | 1.97 (p = 8.2E-27) | 1.92 (p = 2.8E-19) | 2.14 (p = 2.2E-8) | 4.35 (p = 7.8E-11) |
| ER Status | Positive vs. Negative | 0.90 (p = 0.18) | NA | NA | NA |
| Her2 Status | Positive vs. Negative | 1.31 (p = 0.0016) | 1.40 (p = 2.5E-3) | 1.69 (p = 0.0015) | 1.33 (p = 0.20) |
| GATA-3 Status | Positive vs. Negative | 1.01 (p = 0.85) | 1.04 (p = 0.63) | 0.97 (p = 0.81) | 1.09 (p = 0.48) |

(c) DRFS

| Variable | Comparison | Whole Cohort | ER Positive Only | ER Positive + Tamoxifen | ER Positive - Tamoxifen |
|---|---|---|---|---|---|
| Sample Size | | 3114 | 2416 | 917 | 954 |
| Age at Diagnosis (Yrs) | 40-49 vs. <40 | 0.78 (p = 0.052) | 0.71 (p = 0.036) | NA* | 1.06 (p = 0.93) |
| | 50-65 vs. <40 | 0.85 (p = 0.17) | 0.79 (p = 0.13) | NA | 1.30 (p = 0.57) |
| | >65 vs. <40 | 0.88 (p = 0.29) | 0.78 (p = 0.12) | NA | 1.07 (p = 0.88) |
| Tumor Size (cm) | 2 < size ≤ 5 vs. ≤ 2 | 1.58 (p = 1.1E-10) | 1.63 (p = 2.7E-9) | 1.65 (p = 5.6E-5) | 1.90 (p = 2.2E-5) |
| | > 5 vs. ≤ 2 | 2.13 (p = 8.2E-9) | 2.84 (p = 2.0E-11) | 4.45 (p = 2.9E-9) | 0.59 (p = 0.60) |
| Grade | Grade 3 vs. Grade 1/2 | 1.48 (p = 5.8E-8) | 1.46 (p = 1.7E-6) | 1.46 (p = 0.0020) | 1.43 (p = 0.017) |
| Nodal Status | Positive vs. Negative | 2.39 (p = 4.9E-36) | 2.34 (p = 6.0E-26) | 2.18 (p = 4.6E-8) | 5.23 (p = 2.0E-11) |
| ER Status | Positive vs. Negative | 0.88 (p = 0.13) | NA | NA | NA |
| Her2 Status | Positive vs. Negative | 1.34 (p = 0.0011) | 1.44 (p = 0.0020) | 1.68 (p = 0.0021) | 1.53 (p = 0.087) |
| GATA-3 Status | Positive vs. Negative | 1.00 (p = 0.98) | 1.02 (p = 0.84) | 0.918 (p = 0.48) | 1.08 (p = 0.62) |

(d) OS

| Variable | Comparison | Whole Cohort | ER Positive Only | ER Positive + Tamoxifen | ER Positive - Tamoxifen |
|---|---|---|---|---|---|
| Sample Size | | 3119 | 2421 | 919 | 957 |
| Age at Diagnosis (Yrs) | 40-49 vs. <40 | 0.74 (p = 0.014) | 0.69 (p = 0.025) | NA* | 1.17 (p = 0.74) |
| | 50-65 vs. <40 | 1.07 (p = 0.53) | 1.09 (p = 0.57) | NA | 2.36 (p = 0.060) |
| | >65 vs. <40 | 1.97 (p = 9.1E-10) | 2.00 (p = 4.2E-6) | NA | 4.83 (p = 5.2E-4) |
| Tumor Size (cm) | 2 < size ≤ 5 vs. ≤ 2 | 1.42 (p = 5.6E-10) | 1.45 (p = 5.8E-9) | 1.45 (p = 8.2E-5) | 1.60 (p = 9.2E-6) |
| | > 5 vs. ≤ 2 | 1.81 (p = 4.5E-7) | 2.49 (p = 5.6E-11) | 3.15 (p = 2.2E-7) | 1.25 (p = 0.71) |
| Grade | Grade 3 vs. Grade 1/2 | 1.33 (p = 6.9E-7) | 1.29 (p = 4.5E-5) | 1.24 (p = 0.019) | 1.16 (p = 0.15) |
| Nodal Status | Positive vs. Negative | 1.77 (p = 1.9E-25) | 1.75 (p = 3.1E-19) | 1.79 (p = 3.7E-8) | 3.25 (p = 2.6E-9) |
| ER Status | Positive vs. Negative | 0.88 (p = 0.075) | NA | NA | NA |
| Her2 Status | Positive vs. Negative | 1.27 (p = 0.0023) | 1.45 (p = 1.3E-4) | 1.54 (p = 0.0026) | 1.44 (p = 0.041) |
| GATA-3 Status | Positive vs. Negative | 0.99 (p = 0.82) | 1.00 (p = 0.97) | 0.92 (p = 0.40) | 1.12 (p = 0.29) |

Figure 4.1. GATA-3 Immunostains. GATA-3 immunostaining was scored semiquantitatively; tumors with <5% nuclei stained were considered GATA-3 negative (a), tumors with 5-20% nuclei stained were moderately positive (b), and tumors with >20% nuclei stained were strongly positive (c).

[Figure 4.1 panels: (a) GATA-3 Negative; (b) GATA-3 Moderately Positive; (c) GATA-3 Strongly Positive.]

Figure 4.2. Survival Analysis of GATA-3 in Whole Cohort. GATA-3 positive vs. negative Kaplan-Meier plots in the whole patient cohort, for the outcomes (a) breast cancer specific survival (BCSS), (b) relapse free survival (RFS), (c) distant relapse free survival (DRFS), and (d) overall survival (OS).
In the whole patient cohort, and for all outcomes, the presence of GATA-3 is a weak marker of good prognosis. The difference in 10-year BCSS is 78% vs. 72% (a, p = 9.6E-05), 10-year RFS is 65% vs. 62% (b, p = 2.5E-03), 10-year DRFS is 71% vs. 67% (c, p = 0.0016), and 10-year OS is 65% vs. 62% (d, p = 0.0039).

[Figure 4.2 panels: Kaplan-Meier plots of GATA-3 Positive vs. GATA-3 Negative in all patients for (a) BCSS, (b) RFS, (c) DRFS, and (d) OS.]

Figure 4.3. Survival Analysis of GATA-3 in ER Positive Breast Tumors. GATA-3 positive vs. negative Kaplan-Meier plots in the ER-positive subgroup, for the outcomes (a) breast cancer specific survival (BCSS), (b) relapse free survival (RFS), (c) distant relapse free survival (DRFS), and (d) overall survival (OS). In the ER positive subgroup, and for all outcomes, GATA-3 is not a marker of prognosis. The difference in 10-year BCSS is 78% vs. 76% (a, p = 0.26), 10-year RFS is 65% vs. 64% (b, p = 0.55), 10-year DRFS is 71% vs. 69% (c, p = 0.34), and 10-year OS is 65% vs. 64% (d, p = 0.30).

[Figure 4.3 panels: Kaplan-Meier plots of GATA-3 Positive vs. GATA-3 Negative in ER positive patients for (a) BCSS, (b) RFS, (c) DRFS, and (d) OS.]

References:

1. Fisher B, Redmond C, Fisher ER, Caplan R. Relative worth of estrogen or progesterone receptor and pathologic characteristics of differentiation as indicators of prognosis in node negative breast cancer patients: findings from National Surgical Adjuvant Breast and Bowel Project Protocol B-06. Journal of Clinical Oncology 1988;6(7):1076-87.
2. Grann VR, Troxel AB, Zojwalla NJ, Jacobson JS, Hershman D, Neugut AI, et al. Hormone receptor status and survival in a population-based cohort of patients with breast carcinoma. Cancer 2005;103(11):2241-51.
3. Early Breast Cancer Trialists' Collaborative Group. Effects of chemotherapy and hormonal therapy for early breast cancer on recurrence and 15-year survival: an overview of the randomised trials. Lancet 2005;365(9472):1687-717.
4. Grana G. Adjuvant aromatase inhibitor therapy for early breast cancer: A review of the most recent data. Journal of Surgical Oncology 2006;93(7):585-92.
5. Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, et al. Molecular portraits of human breast tumours. Nature 2000;406(6797):747-52.
6. Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proceedings of the National Academy of Sciences of the United States of America 2001;98(19):10869-74.
7. Sotiriou C, Neo SY, McShane LM, Korn EL, Long PM, Jazaeri A, et al. Breast cancer classification and prognosis based on gene expression profiles from a population-based study. Proceedings of the National Academy of Sciences of the United States of America 2003;100(18):10393-8.
8. Morgan LR Jr, Schein PS, Woolley PV, Hoth D, Macdonald J, Lippman M, et al. Therapeutic use of tamoxifen in advanced breast cancer: correlation with biochemical parameters. Cancer Treatment Reports 1976;60(10):1437-43.
9. Osborne CK. Tamoxifen in the treatment of breast cancer. New England Journal of Medicine 1998;339(22):1609-18.
10. LaVoie HA. The role of GATA in mammalian reproduction. Experimental Biology and Medicine (Maywood, N.J.) 2003;228(11):1282-90.
11. Tong Q, Tsai J, Tan G, Dalgin G, Hotamisligil GS. Interaction between GATA and the C/EBP family of transcription factors is critical in GATA-mediated suppression of adipocyte differentiation. Molecular & Cellular Biology 2005;25(2):706-15.
12. Usary J, Llaca V, Karaca G, Presswala S, Karaca M, He X, et al. Mutation of GATA3 in human breast tumors. Oncogene 2004;23(46):7669-78.
13. Gulbinas A, Berberat PO, Dambrauskas Z, Giese T, Giese N, Autschbach F, et al. Aberrant GATA-3 expression in human pancreatic cancer. The Journal of Histochemistry and Cytochemistry 2006;54(2):161-9.
14. Steenbergen RD, OudeEngberink VE, Kramer D, Schrijnemakers HF, Verheijen RH, Meijer CJ, et al. Down-regulation of GATA-3 expression during human papillomavirus-mediated immortalization and cervical carcinogenesis. The American Journal of Pathology 2002;160(6):1945-51.
15. Charafe-Jauffret E, Ginestier C, Monville F, Finetti P, Adelaide J, Cervera N, et al. Gene expression profiling of breast cell lines identifies potential new basal markers. Oncogene 2006;25(15):2273-84.
16. Mehra R, Varambally S, Ding L, Shen R, Sabel MS, Ghosh D, et al. Identification of GATA3 as a breast cancer prognostic marker by global gene expression meta-analysis. Cancer Research 2005;65(24):11259-64.
17. Oh DS, Troester MA, Usary J, Hu Z, He X, Fan C, et al. Estrogen-regulated genes predict survival in hormone receptor-positive breast cancers. Journal of Clinical Oncology 2006;24(11):1656-64.
18. Olivotto IA, Bajdik CD, Ravdin PM, Speers CH, Coldman AJ, Norris BD, et al. Population-based validation of the prognostic model ADJUVANT! for early breast cancer. Journal of Clinical Oncology 2005;23(12):2716-25.
19. Kononen J, Bubendorf L, Kallioniemi A, Barlund M, Schraml P, Leighton S, et al. Tissue microarrays for high-throughput molecular profiling of tumor specimens. Nature Medicine 1998;4(7):844-7.
20. Parker RL, Huntsman DG, Lesack DW, Cupples JB, Grant DR, Akbari M, et al. Assessment of interlaboratory variation in the immunohistochemical determination of estrogen receptor status using a breast cancer tissue microarray. American Journal of Clinical Pathology 2002;117(5):723-8.
21. van de Rijn M, Perou CM, Tibshirani R, Haas P, Kallioniemi O, Kononen J, et al. Expression of cytokeratins 17 and 5 identifies a group of breast carcinomas with poor clinical outcome. The American Journal of Pathology 2002;161(6):1991-6.
22. Bertucci F, Houlgatte R, Benziane A, Granjeaud S, Adelaide J, Tagett R, et al. Gene expression profiling of primary breast carcinomas using arrays of candidate genes. Human Molecular Genetics 2000;9(20):2981-91.
23. Parikh P, Palazzo JP, Rose LJ, Daskalakis C, Weigel RJ. GATA-3 expression as a predictor of hormone response in breast cancer. Journal of the American College of Surgeons 2005;200(5):705-10.
24. Paik S, Shak S, Tang G, Kim C, Baker J, Cronin M, et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. New England Journal of Medicine 2004;351(27):2817-26.
CHAPTER FIVE

KI-67 AND HER2 IDENTIFY HORMONE RECEPTOR POSITIVE LUMINAL B BREAST CANCERS WITH POOR PROGNOSIS⁴

⁴ A version of this chapter was submitted to a cancer research journal. Cheang MCU, Chia S, Voduc D, Gao D, Leung S, Snider J, Watson M, Davies S, Bernard P, Parker J, Perou C, Ellis M, Nielsen TO. Ki-67 and HER2 Identify Hormone Receptor Positive Luminal B Breast Cancers with Poor Prognosis.

As described in Chapter 4, GATA-3 failed to validate as a useful, independent prognostic biomarker for Luminal A breast cancers, and so we sought another candidate. We found Bcl-2 to be associated with good prognosis among estrogen receptor positive tumors (Cheang et al., 2005 Annual Meeting of the Canadian Association of Pathologists, Victoria, BC, June 18-22, 2005), consistent with another reported study (Callagy et al. Clin Cancer Res 2006; 12(8):2468-75). However, Bcl-2 failed to validate as a useful prognostic factor in the series of 4000 patients. Some Luminal B tumors can be identified by their coexpression of HER2, an established biomarker in breast cancer that predicts response to trastuzumab. Its importance also makes it an essential marker to be tested in each invasive breast cancer. In gene microarray data, the major distinguisher of Luminal A from Luminal B is a prominent cell proliferation signature, which is consistently high in Luminal B tumours. The proliferation signature is an important gene set encoding proteins involved in DNA replication that correlates highly with mitotic index (Perou et al. Nature 2000; 406:747-52) and Ki-67 labeling index (Perou et al. PNAS 1999; 96:9212-9217), and is shown to be a dominant expression pattern in many ER+ tumours (Dai et al. Cancer Res 2005; 65:4059-66; Paik et al. NEJM 2004; 351:2817-26).

We hypothesized that a high Ki-67 labeling index can identify additional Luminal B breast cancers. Ki-67 is a proliferation marker expressed exclusively in tumour nuclei.
Therefore, the Ki-67 immunostains can be interpreted quantitatively as the percentage of positive tumour nuclei. We first sought to determine the optimal cut point for the percentage of positive Ki-67 expression to define the Luminal B subtype by comparing gene expression profile assignments with Ki-67 immunohistochemistry data from formalin-fixed paraffin embedded tissues using an independent series of 357 invasive breast cancers. Next, using our population-based series of over 4000 breast cancers, we sought to determine the prognostic values of Ki-67 and HER2 among hormone receptor positive tumours subset by the type of adjuvant systemic therapy received. This chapter has been written and will be submitted to a cancer journal in summer 2008.

INTRODUCTION

Breast cancer is a molecularly heterogeneous disease, composed of a spectrum of tumor subtypes with distinct clinical outcomes (1-3). Currently the choice of adjuvant systemic therapy is based on patient age, tumor size, histological grade, lymph node involvement, hormone receptor and HER2 status. The only predictive markers with associated targeted therapy in breast cancer are the estrogen receptor (ER) and the HER2 receptor. HER2 over-expressing/amplified breast cancers (~15%) are currently treated with trastuzumab, a monoclonal antibody targeting the extracellular domain of HER2, in combination with or following adjuvant chemotherapy(4). For the two-thirds of breast cancers that are positive for estrogen and/or progesterone (PgR) receptors, endocrine therapy (tamoxifen(5) and/or aromatase inhibitors) is generally indicated(6, 7). The systematic application of adjuvant systemic therapy has contributed to a recent decrease in the breast cancer mortality rate in Western nations(8). However, adjuvant systemic therapy not only fails to prevent recurrence in all high-risk breast cancer cases, but there is also likely over-treatment with both chemotherapy and hormonal therapy in the lower risk node negative cohort.
There is, therefore, a need to identify patients at high risk for recurrence despite current treatment options so that new therapeutic options can be targeted to those patients who actually need them, as well as to identify those patients in whom hormonal therapy alone may be adequate adjuvant systemic therapy.

Ki-67, a nuclear marker of cell proliferation, has previously been studied for its prognostic role in breast cancer. Most studies have shown that tumors expressing a high level of Ki-67 are associated with worse breast cancer outcomes(9-11). However, these study cohorts involved heterogeneously treated patient populations. Ki-67 is not currently included in routine clinical decision-making because of a lack of clarity regarding how Ki-67 measurements should influence clinical decisions. Recent studies suggest that measuring changes in Ki-67 after the initiation of endocrine treatment in the neoadjuvant setting may predict long-term outcome (12, 13).

Gene expression studies have identified five molecularly distinct subtypes of breast cancer, which are prognostic across several treatment settings (3, 14-16). These five subtypes are termed ER+/Luminal A (Luminal A), ER+/Luminal B (Luminal B), HER2 enriched (HER2E), basal-like and normal breast-like. HER2E and basal-like subtypes are hormone receptor negative and typically have a poorer outcome. In contrast, the expression of ER and ER-associated genes characterizes the Luminal breast cancers. Among Luminal breast cancers, Luminal B tumors have a significantly worse outcome than Luminal A. Although some Luminal B tumors can be identified by coexpression of HER2, the major biological distinction between Luminal A and Luminal B is the proliferation signature(14, 17), which is consistently high in Luminal B tumors(14).
The recurrence score(18), a quantitative reverse-transcription polymerase chain reaction (qRT-PCR) assay, divides ER positive, node negative tumors into good, intermediate, and poor prognosis subgroups based on expression of 16 cancer-related genes. The proliferation signature, which includes the Ki-67 gene MKI67, is the most heavily weighted component in this assay.

These data suggest that the distinction of Luminal A vs. B, or more simply put, proliferation status amongst ER+/Luminal patients, is important. The high cost of gene expression profiling studies has limited their incorporation into most randomized clinical trials, and therefore, microarray defined proliferation status is not currently an option. To date, there has not been an available immunohistochemistry (IHC)-based surrogate assay to accurately identify all Luminal B tumors. However, our previously defined Luminal B IHC surrogate of being hormone receptor positive and HER2+ does identify patients with a worse outcome (19, 20). Technically excellent IHC antibodies now exist for Ki-67, which may therefore serve as a potentially clinically valuable Luminal B marker.

We therefore sought to (1) determine the optimal cut point for the percentage of positive Ki-67 expression to define the Luminal B subtype by comparing gene expression profile assignments and IHC data from formalin fixed paraffin embedded (FFPE) tissues from a cohort of 357 invasive breast cancers; (2) determine the pure prognostic value of Ki-67 among patients with hormone receptor positive tumors who received no adjuvant systemic therapy, using an independent regional population-based series of 4,046 breast cancers; and (3) evaluate empirically the association between Ki-67 and patient outcome among hormone receptor positive tumors treated uniformly with tamoxifen as their sole adjuvant systemic therapy.
We hypothesized that an immunohistochemical panel of ER, PgR, HER2 and Ki-67 could serve as a diagnostic tool to identify the poor prognosis Luminal B subgroup of breast cancers.

MATERIALS AND METHODS

Quantitative RT-PCR gene expression profiling to define breast cancer subtypes

A total of 357 FFPE tissues from invasive breast carcinomas were obtained under approved IRB protocols at the University of British Columbia and Washington University at St Louis; this cohort is hereafter referred to as the UBC-WashU series. In this cohort, 51% of the tumors were node positive, 56% had tumor size > 2 cm and 37% were Grade 3 (Supplemental Table 5.1). A 50-gene classifier was used to make intrinsic subtype assignments from gene expression profiles generated by quantitative RT-PCR(21). Subtype prediction was done by calculating the Spearman's rank correlation of each tumor sample to each of the five centroids (Luminal A, Luminal B, HER2-enriched, Basal-like, and Normal breast-like). Class was assigned based upon the nearest centroid, with centroids constructed as previously described(21). IHC was also done for ER, PgR, HER2 and Ki-67 on these tumor cases (described below).

Tissue Microarrays

Tissue microarrays (TMAs) were constructed using FFPE archival blocks from 4,046 primary invasive breast tumors in female patients, again with details previously described(19, 22, 23). All patients had been referred to the British Columbia Cancer Agency between 1986 and 1992 and have staging, pathology, treatment and follow-up information available (Supplemental Table 5.2)(19). This cohort is referred to as the BCCA series in this study. The median follow-up time was 12.5 years. In British Columbia, most patients were treated with adjuvant systemic therapy according to provincial cancer management guidelines set by the British Columbia Cancer Agency(24).
The guidelines provided criteria for defining high risk patients who could benefit from chemotherapy; the criteria included age, node positivity, presence of lymphovascular invasion (LVI), tumor size > 2 cm, histologic grade and ER levels determined by dextran charcoal ligand-binding assay. Patients considered as clinical “low risk” at the time of diagnosis during the study era (approximately 40% of the study cohort) were not given any adjuvant systemic therapy (AST). Approval was obtained from the Clinical Research Ethics Board of the University of British Columbia and the British Columbia Cancer Agency to use these FFPE breast tissues in this study.

Immunohistochemistry and Fluorescent In-Situ Hybridization (FISH)

Immunohistochemical staining and interpretation for ER, PgR and HER2 were described previously(19). The Ki-67 (Clone SP6, LabVision, Fremont, CA) antibody was used at 1:200 dilution for 32 minutes following Ventana CC1 pretreatment for 30 minutes. Biomarker expression from IHC assays was scored by two anatomical pathologists, blinded to the clinicopathological characteristics and outcome, using previously established and published criteria developed on other breast cancer cohorts. ER and PgR stains were considered positive if immunostaining was seen in more than 1% of tumor nuclei. For HER2 status, tumors were considered positive if scored as 3+ according to HercepTest criteria, with a FISH amplification ratio ≥ 2.0 used to segregate IHC equivocal (2+) cases. Ki-67 was visually scored for percentage of any tumor cell nuclear positivity above background by two pathologists (TON and DG). Samples with < 50 tumor cells present in the tissue microarray cores were considered uninterpretable.

All the stained tissue microarrays have been digitally scanned, and the primary image data are available for public access (http://www.gpecimage.ubc.ca; username, LuminalB; password, luminalb).
Immunopanel to define Luminal A and Luminal B subtypes

The expression of ER and ER-associated genes characterizes the Luminal breast cancers defined by microarray expression profiling. The immunohistochemical surrogate to define luminal breast tumors had been previously published(19, 20, 25). In this study, tumors expressing HER2 together with one of the luminal hormone receptors (ER or PgR) were defined as Luminal/HER2+. Luminal/HER2+ cases are one type of Luminal B tumor, but only approximately 30% of Luminal B tumors express HER2.

We hypothesized that using the proliferation marker Ki-67 to define Luminal B tumors (i.e. hormone receptor positive, HER2 negative, and Ki-67 high) would identify a group of tumors with significantly worse survival than their Luminal A counterparts (hormone receptor positive, HER2 negative, and Ki-67 low). Tumors displayed continuously variable Ki-67 indices (i.e. the ratio of tumor cell nuclei positive for Ki-67 immunostaining to total tumor nuclei present in a histological section). In order to determine the optimal cut-off of Ki-67, we identified tumors unambiguously classified as either Luminal A or Luminal B on the basis of gene expression profile analysis of the UBC-WashU series. Then we anchored our immunopanel (hormone receptor positive, HER2 negative) with visually assessed quantitative Ki-67 data against these assignments. In this fashion, the cutoff value was selected against the gold standard of expression profiling, as opposed to assigning a cutpoint against clinical outcome, which can be difficult to extrapolate to other patient populations with treatment differences and a heterogeneous risk population.

Statistical Analysis

All statistical analyses were carried out using SPSS 16.0 (SPSS Inc, Chicago, IL) and R 2.6.0 (www.r-project.org).
The optimal cutoff value for Ki-67 was selected using the receiver operating characteristic (ROC) curve method by minimizing the sum of the observed false-positive and false-negative errors(26). The sensitivity and specificity were estimated via bootstrap adjustment for optimism(27). Differences between breast cancer subtypes with regard to clinicopathologic characteristics were examined using χ² tests. For univariate survival analysis, relapse-free survival (RFS) and breast cancer specific survival (BCSS) were estimated using Kaplan-Meier curves(28), and survival differences were assessed by log-rank tests(29). For RFS, survival time was censored at death if the cause was not breast cancer, or if the patient was alive without relapse on June 30, 2004. For BCSS, survival time was censored at death if the cause was not breast cancer, or if the patient was still alive on June 30, 2004 (the date when the outcome data were collected). Patients with unknown cause of death were excluded from BCSS analysis. For multivariable survival analyses, Cox regression models(30) were built to estimate the adjusted hazard ratios of Ki-67 and breast cancer subtypes with standard clinicopathological variables: age at diagnosis, histologic grade, tumor size, lymphovascular invasion (LVI) and number of positive axillary lymph nodes as a percentage of the total number examined. The percentage of positive/dissected lymph nodes was used in the Cox model since it was shown to be more prognostic than categorizing breast cancers into groups of one to three or more than 4 positive nodes in the British Columbia population(31). Only cases with information for all the covariates were included in the Cox regression analyses. Smoothed plots of weighted Schoenfeld residuals were used to test proportional hazard assumptions(32). All tests were 2-sided and p-values less than 0.05 were considered statistically significant.
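The cutoff selection rule described above (choose the Ki-67 threshold that minimizes the sum of false-positive and false-negative calls against the expression-profile labels) can be illustrated with a minimal sketch. This is an illustration under stated assumptions, not the analysis code used in this study: the function name `best_ki67_cutoff`, the use of midpoints between observed values as candidate thresholds, the tie-breaking rule, and the toy data are all hypothetical, and the bootstrap adjustment for optimism is omitted.

```python
from typing import List, Sequence, Tuple


def best_ki67_cutoff(
    ki67: Sequence[float], is_luminal_b: Sequence[bool]
) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) minimizing FP + FN.

    A tumor is called Luminal B when its Ki-67 index is >= cutoff.
    Candidate cutoffs are midpoints between adjacent distinct observed
    values (an assumption of this sketch); ties are resolved in favour
    of the lowest cutoff. Assumes both classes are present and at least
    two distinct Ki-67 values exist.
    """
    values: List[float] = sorted(set(ki67))
    candidates = [(lo + hi) / 2 for lo, hi in zip(values, values[1:])]
    n_pos = sum(1 for y in is_luminal_b if y)
    n_neg = len(is_luminal_b) - n_pos

    best = None  # (total errors, cutoff, false negatives, false positives)
    for cutoff in candidates:
        fn = sum(1 for x, y in zip(ki67, is_luminal_b) if y and x < cutoff)
        fp = sum(1 for x, y in zip(ki67, is_luminal_b) if not y and x >= cutoff)
        if best is None or fn + fp < best[0]:
            best = (fn + fp, cutoff, fn, fp)

    _, cutoff, fn, fp = best
    return cutoff, 1 - fn / n_pos, 1 - fp / n_neg


# Hypothetical toy data: Ki-67 index (%) and expression-profile label.
toy_ki67 = [2, 5, 8, 10, 12, 11, 16, 20, 30, 40]
toy_is_b = [False] * 5 + [True] * 5
cutoff, sensitivity, specificity = best_ki67_cutoff(toy_ki67, toy_is_b)
```

On real data the apparent sensitivity and specificity at the chosen cutoff are optimistic, which is why a bootstrap correction of the kind cited above would normally follow this step.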
The data were assembled to provide more than 80% power for testing hypotheses regarding the biomarkers in all patients combined, as well as for patient subgroups defined by the adjuvant therapies they received.

RESULTS

Determination of optimal cutoff value for Ki-67 index to identify Luminal B tumors

Among the 357 breast tumors of the UBC-WashU series with qRT-PCR gene expression profiles and tumor subtypes assigned by the 50-gene PAM50 qRT-PCR predictor(21), 28% (101/357) of the cases were classified as Luminal A, 19% (69/357) as Luminal B, 17% (62/357) as HER2-enriched, 27% (98/357) as Basal-like and 8% (27/357) as Normal breast-like subtypes. Cases that were ER positive and HER2 negative by IHC were expected to include a mix of Luminal A and Luminal B subtypes which might be distinguishable by Ki-67 index. Linking the immunohistochemical data to the expression profile assignments, there were 84 Luminal A and 60 Luminal B tumors that were positive for hormone receptor and negative for HER2.

Among these 144 Luminal tumors, the best cutoff value for immunohistochemically-determined Ki-67 index to distinguish the Luminal B subtype was 13.25% (sensitivity: 72%, specificity: 77%, error rate: 27%; Supplemental Figure 1). Among these 144 cases, 17 were borderline tumors (having less than 0.100 difference between their individual Spearman rank correlation coefficients to the Luminal A and Luminal B centroids). We therefore repeated the analysis, excluding these borderline tumors and restricting the ROC analysis to the 74 unambiguous Luminal A and 53 unambiguous Luminal B tumors. The best cutoff value for Ki-67 IHC data remained 13.25% (sensitivity: 77%, specificity: 78%, error rate: 23%; Figure 5.1).

Since our objective was to deduce an IHC panel that can be clinically applied, we chose (Ki-67 ≥ 14%) as the best cutoff to identify the Luminal B subtype for human visual assessment.
On this basis, we defined Luminal A as (ER or PgR) positive and HER2 negative and Ki-67 negative/low (< 14%). Luminal B is now defined as (ER or PgR) positive and HER2 negative and Ki-67 high (≥ 14% of tumor nuclei positive). Since HER2 positive tumors are currently treated differently from other breast cancer subtypes, we categorized them as Luminal/HER2+ tumors in the survival analyses for this study, thus separating Luminal tumors into three groups (i.e. Luminal A, Luminal B and Luminal/HER2+).

Predicting survival among hormone receptor positive breast cancers using a surrogate immunopanel of ER, PgR, HER2 and Ki-67

In the BCCA series of 4,046 breast tumors, a total of 2,847 tumors were hormone receptor positive. There were 2,598 tumors with complete IHC data for ER, PgR, HER2 and Ki-67. The cases with missing data did not differ significantly in age at diagnosis, tumor size, nodal status or adjuvant systemic therapies but were associated with less lymphovascular invasion (p = 0.009) and a marginal tendency for lower grade (1 or 2) (p = 0.048).

Using HER2 and Ki-67 IHC to subtype the 2,598 hormone receptor positive tumors, 59% (1530/2598) of cases were classified as Luminal A, 33% (846/2598) as Luminal B and 9% (222/2598) as Luminal/HER2+ tumors. The tumor characteristics of each luminal subtype are summarized in Table 5.1. The Luminal subtypes differed by age, tumor size, grade, LVI and nodal status. In comparison with Luminal A tumors, both Luminal B and Luminal/HER2+ tumors were associated with younger age at diagnosis (p = 8.4 × 10^-5), higher grade (p = 7.6 × 10^-35), larger tumor size (p = 8.2 × 10^-17), positive nodal involvement (p = 5.8 × 10^-6) and lymphovascular invasion (p = 3.4 × 10^-10).
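The three-group immunopanel definition given earlier in this section (hormone receptor status, then HER2 status, then the 14% Ki-67 threshold) amounts to a short decision rule. The sketch below restates it for illustration only; the function name `luminal_subtype` and its argument names are hypothetical, and inputs are assumed to be already-scored IHC results.

```python
from typing import Optional


def luminal_subtype(
    er_positive: bool,
    pgr_positive: bool,
    her2_positive: bool,
    ki67_percent: float,
    ki67_cutoff: float = 14.0,
) -> Optional[str]:
    """Assign the IHC surrogate luminal subtype, or None if not luminal.

    Hormone receptor positive means ER or PgR positive. HER2 positive
    luminal tumors form their own group regardless of Ki-67; the rest
    are split at the Ki-67 cutoff (>= cutoff is Luminal B).
    """
    if not (er_positive or pgr_positive):
        return None  # hormone receptor negative: outside the luminal groups
    if her2_positive:
        return "Luminal/HER2+"
    return "Luminal B" if ki67_percent >= ki67_cutoff else "Luminal A"
```

For example, `luminal_subtype(True, False, False, 5.0)` falls in the Luminal A group, while the same tumor with a Ki-67 index of 14% or more would be called Luminal B.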
Clinical “low-risk” patients receiving no adjuvant systemic therapy

Using the immunohistochemical surrogate definition, there were 625 Luminal A, 263 Luminal B and 55 Luminal/HER2+ tumors in the BCCA series from patients who were node negative at diagnosis and did not receive any adjuvant systemic therapy. The 10-yr RFS (95% C.I.) for Luminal A, Luminal B and Luminal/HER2+ were 78% (75-82), 67% (61-73) and 64% (52-78), respectively (Figure 5.2a, log-rank p = 0.0013). The 10-yr BCSS (95% C.I.) for Luminal A, Luminal B and Luminal/HER2+ were 92% (90-94), 79% (74-85) and 78% (67-90), respectively (Figure 5.2b, log-rank p = 1.37 × 10⁻⁶). In the multivariable Cox regression analyses, when compared to Luminal A, Luminal B had a hazard ratio of 1.4 for RFS and of 1.8 for BCSS (Table 5.2). Luminal/HER2+ was also associated with worse relapse-free survival (hazard ratio of 1.6, p = 0.066) and with significantly worse breast cancer specific survival (hazard ratio of 2.1, p = 0.016) when compared to Luminal A (Table 5.2). Ki-67 and HER2 thus provide complementary prognostic information in addition to the standard clinical parameters for the study of the natural history of clinical low-risk, node negative, hormone receptor positive breast cancer patients receiving local treatment only.

Adjuvant tamoxifen subset

Using the immunohistochemical surrogate definition, there were 583 Luminal A, 303 Luminal B and 90 Luminal/HER2+ tumors in the BCCA cohort from patients treated with tamoxifen as the only adjuvant systemic therapy. The 10-yr RFS (95% C.I.) for Luminal A, Luminal B and Luminal/HER2+ were 70% (66-74), 53% (47-59) and 51% (41-63), respectively (Figure 5.3a, log-rank p = 2.98 × 10⁻⁷). The 10-yr BCSS (95% C.I.) for Luminal A, Luminal B and Luminal/HER2+ were 79% (76-83), 64% (59-70) and 57% (47-69), respectively (Figure 5.3b, log-rank p = 1.15 × 10⁻⁷).
In the multivariable Cox regression analyses, when compared to Luminal A, both the Luminal B and Luminal/HER2+ tumors had more than a 1.5-fold increased risk of relapse and of death from breast cancer (Table 5.3). These findings held up in both the node negative and node positive subsets (Supplemental Figures 5.2 and 5.3; Supplemental Tables 5.3 and 5.4).

Combined tamoxifen and chemotherapy adjuvant subset

In the BCCA cohort, a total of 196 patients with hormone receptor positive tumors were treated with both adjuvant tamoxifen and chemotherapy. Among these individuals, 124 were treated with AC or FAC (anthracycline-based) and 72 with CMF (non-anthracycline). The majority of these tumors were high grade, larger than 2 cm, positive for lymphovascular invasion, or had more than 25% positive axillary nodes over total examined lymph nodes at diagnosis. There were no significant differences in the clinicopathological parameters and breast cancer subtypes between the anthracycline-based (AC or FAC) and non-anthracycline-based (CMF) treated cohorts (Supplemental Table 5.5).

Using the immunohistochemical surrogate definition, there were 87 Luminal A, 84 Luminal B and 25 Luminal/HER2+ tumors in this subset. The 10-yr RFS (95% C.I.) for Luminal A, Luminal B and Luminal/HER2+ were 69% (59-79), 51% (42-63) and 42% (26-67), respectively (Figure 5.4a, log-rank p = 0.0076). The 10-yr BCSS (95% C.I.) for Luminal A, Luminal B and Luminal/HER2+ were 78% (69-87), 58% (48-70) and 44% (28-70), respectively (Figure 5.4b, log-rank p = 0.0025). In multivariable Cox regression analyses, breast cancer subtyping retained its independent association with survival (Table 5.4). When compared to Luminal A, both Luminal B and Luminal/HER2+ had approximately a twofold or greater increased risk of recurrence and of death from breast cancer. In this subset, breast cancer subtype was the only variable that was significantly associated with breast cancer outcome.
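The hazard ratios, confidence intervals and p-values quoted from the Cox models can be cross-checked against one another. The sketch below assumes a Wald-type 95% interval that is symmetric on the log hazard scale, which is the usual convention but is not stated explicitly in this chapter. The function name `wald_p_from_ci` is hypothetical. The input values are the Luminal B versus Luminal A relapse-free survival row of Table 5.2: 1.43 (1.08-1.90).

```python
import math


def wald_p_from_ci(hazard_ratio, lower, upper, z_crit=1.959964):
    """Recover the z statistic and two-sided p-value from a 95% C.I.

    Assumes the interval is exp(log(HR) +/- z_crit * SE), i.e. a
    Wald interval on the log hazard ratio scale.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hazard_ratio) / se
    # Two-sided p-value under the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Luminal B vs. Luminal A, RFS, no adjuvant systemic therapy (Table 5.2)
z_stat, p_value = wald_p_from_ci(1.43, 1.08, 1.90)
print(round(z_stat, 2), round(p_value, 3))
```

The recovered p-value is close to 0.013, in line with the tabulated value; small discrepancies are expected because the published hazard ratio and interval are rounded to two decimal places.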
DISCUSSION

Gene expression profiling studies have consistently revealed biologically distinct breast cancer subtypes with different prognoses(33). The Luminal B breast cancers are a clinically important, yet heterogeneous, subgroup associated with a poorer clinical outcome in both the presence and absence of systemic adjuvant therapy, and are therefore an important group to target for clinical investigation. Using two independent cohorts of invasive breast tumors, our study is the first to date (1) to develop a four-marker surrogate IHC panel (ER, PgR, HER2 and Ki-67) to define the Luminal B subtype against a gene expression profile gold standard, and (2) to validate this panel on an independent series of breast cancers, demonstrating significant associations with relapse and survival.

The Luminal B subtype exhibits increased expression of HER2-associated genes, and a cell proliferation signature that includes MKI67, CCNB1 and MYBL2, which have been associated with tamoxifen resistance(15, 34). Efficient clinical identification of Luminal B breast cancers will isolate, from among otherwise good-prognosis hormone receptor positive breast cancers, a poor-prognosis subgroup that could likely benefit from additional systemic therapy. As suggested from gene expression profiling data, ER/PgR and HER2 coexpression can identify some, but not all, Luminal B tumors (the Luminal/HER2+ group). However, only a minority of Luminal B tumors are HER2 positive (approximately 30%), indicating that this clinical marker alone is not sensitive enough to identify the majority of Luminal B cancers. In this study, we have categorized these tumors as Luminal/HER2+ as they require a distinct treatment approach involving HER2 targeted therapy (e.g. trastuzumab). However, from a biological perspective, these tumors belong to the Luminal B subtype.

Ki-67 is a well-established cell proliferation marker in cancer.
It is also an excellent candidate as a Luminal B biomarker, as suggested by gene microarray data, for several reasons. Two recent meta-analyses reported a significant correlation between high Ki-67 expression and increased risk of breast cancer relapse and death(11, 35). There was substantial overlap among the published reports included in these two meta-analyses. Nonetheless, assessment of Ki-67 has been a matter of some controversy, as some studies used 10%(36, 37) or 20%(38, 39) cut-points, whereas other studies simply dichotomized around the mean(40) or median values(41, 42). Our present study is the first to anchor quantitative Ki-67 visual IHC data to breast cancer biologic subtypes as classified by gene expression profiling. An advantage of this approach is that the optimal threshold of Ki-67 IHC (in this case 14%) was determined against an important distinction in the underlying biology of breast cancer rather than against clinical outcome or the mean/median value of the Ki-67 labeling index distribution in the study population, and is therefore more likely to be reproducible in other cohorts of patients with different treatment regimens. Though gene expression profiles remain the most sensitive method, we have demonstrated that Ki-67 as a single IHC marker can be added to ER, PgR and HER2 to identify additional Luminal B cases with robust sensitivity and specificity. Thus Ki-67 IHC provides an alternative economical, convenient and effective diagnostic tool for immediate use. Furthermore, the addition of EGFR and CK5/6 allows identification of the basal-like subtype of breast cancer(19, 43).

We evaluated the prognostic value of our Luminal B IHC classifier using an independent, regional population-based cohort of 4,046 breast tumors. These patients received adjuvant therapy according to guidelines developed and disseminated by the British Columbia Cancer Agency(24).
We confirmed the significant prognostic value of our Luminal B definition in homogeneously treated subsets of breast cancer. We showed that among the patients with hormone receptor positive tumors who received no adjuvant systemic treatment, Luminal B tumors, as well as Luminal/HER2+ tumors, were significantly associated with increased risk of breast cancer relapse and death when compared with Luminal A. In contemporary practice, almost all hormone receptor positive breast cancer patients are treated with hormonal therapy (tamoxifen or aromatase inhibitors), and in the current study very similar findings were present in the subgroup receiving adjuvant tamoxifen only.

In multivariable analysis, Luminal B and Luminal/HER2+ provided significant prognostic value beyond the current standard clinicopathological parameters. The Cox regression models included tumor size, age at diagnosis, grade, nodal involvement and lymphovascular invasion status, which are the compulsory variables for Adjuvant! and for calculation of the Nottingham Prognostic Index. Indeed, almost half of our patient cohort was included in an earlier study confirming that, in the British Columbia population, Adjuvant! predictions were comparable with observed outcomes(44), providing confidence to support extrapolation of the conclusions drawn in the present study to North American and United Kingdom populations. Furthermore, Luminal B status retains its independent prognostic significance in both the node negative and node positive subsets of hormone receptor positive breast cancers treated with adjuvant tamoxifen. The recurrence score is currently an approved diagnostic test to predict distant recurrence for ER positive breast cancer with negative axillary nodes treated with adjuvant tamoxifen(18), and uses MKI67 and other proliferation-associated genes to calculate a recurrence risk score.
This assay has not been applied to the BCCA series of tumors, which limits our capacity to do a head-to-head concordance comparison between our immunopanel and this qRT-PCR based assay (the cost of which is approximately tenfold higher per case). However, Fan et al(33) have shown that breast cancer subtypes based on gene expression profiling and the recurrence score show significant agreement in their outcome predictions for individual patients. This could suggest that among the adjuvant tamoxifen treated, node negative, ER positive patients, the high recurrence scores may largely be tracking the Luminal B cancers.

The predictive value of the Luminal B subtype for response to adjuvant systemic chemotherapy, if any, is unclear. A meta-analysis of 1,521 patients with endocrine-responsive tumors enrolled in two randomized trials of adjuvant chemoendocrine therapy reported that Ki-67 expression as a single marker did not predict resistance to, or benefit from, chemotherapy added to hormonal therapy, beyond the benefit incurred with hormonal therapy alone(42). The chemotherapy regimen used in these two trials was CMF. The median value, 19%, was chosen as the cut-off value for Ki-67 in that study. In contrast, in our study we assessed Ki-67 only in the context of hormone receptor positive, HER2 negative tumors, with a cut-point of 14%. Our results show that Ki-67 and HER2 can risk-stratify breast cancer patients with hormone receptor positive disease treated with both tamoxifen and chemotherapy as their adjuvant systemic therapy. We also show that this cut-point retains its independent significance in a cohort treated with chemotherapy and hormonal therapy.

In conclusion, we have demonstrated that an immunopanel of ER, PgR, HER2 and Ki-67 can segregate Luminal A and Luminal B biological breast cancer subtypes, which carry distinct risks for breast cancer relapse and death.
Our Luminal subtyping definition, which was developed against and anchored to a gene expression profile gold standard, may be of relevance in evaluating the need for adjuvant chemotherapy in addition to optimal hormonal therapy for hormone receptor positive tumors.

Table 5.1 Patients' Characteristics. Clinicopathological characteristics of the 2598 Luminal breast tumors from the BCCA series. Percentages are within subtype.

Characteristics         Luminal A       Luminal B       Luminal/HER2+   Total
                        (N = 1530)      (N = 846)       (N = 222)       (N = 2598)
                        N (%)           N (%)           N (%)           N
Age, yrs
  ≤ 40                  82 (5.4)        82 (9.7)        21 (9.5)        185
  40-49                 261 (17.1)      175 (20.7)      49 (22.1)       485
  50-65                 566 (37.0)      275 (32.5)      78 (35.1)       919
  > 65                  621 (40.6)      314 (37.1)      74 (33.3)       1009
Grade
  1 or 2                925 (60.5)      350 (41.4)      60 (27.0)       1335
  3                     521 (34.1)      468 (55.3)      155 (69.8)      1144
  Unknown               84 (5.5)        28 (3.3)        7 (3.2)         119
Tumor size, cm
  ≤ 2                   943 (61.6)      384 (45.4)      90 (40.5)       1417
  > 2                   580 (37.9)      450 (53.2)      131 (59.0)      1161
  Unknown               7 (0.5)         12 (1.4)        1 (0.5)         20
Number of positive axillary lymph nodes
  0                     844 (55.2)      422 (49.9)      90 (40.5)       1356
  1-3                   436 (28.5)      244 (28.8)      64 (28.8)       744
  > 4                   180 (11.8)      140 (16.5)      54 (24.3)       374
  Unknown               70 (4.6)        40 (4.7)        14 (6.3)        124
Lymphovascular invasion
  Negative              874 (57.1)      385 (45.5)      84 (37.8)       1343
  Positive              590 (38.6)      423 (50.0)      133 (59.9)      1146
  Unknown               66 (4.3)        38 (4.5)        5 (2.3)         109

Table 5.2. Cox Model of Breast Cancer Subtypes, No AST. Multivariable Cox regression analyses to estimate the adjusted hazard ratios of breast cancer subtypes among 883 node negative, hormone receptor positive patients with complete data for covariates and who did not receive any adjuvant systemic therapy. RFS: Relapse-free survival. BCSS: Breast cancer specific survival. LVI: lymphovascular invasion.

                                RFS (N = 883)                           BCSS (N = 879)
Characteristics                 Hazard Ratio (95% C.I.)   P-value       Hazard Ratio (95% C.I.)   P-value
Age at diagnosis, yr            1.00 (0.99-1.01)          0.43          1.01 (0.99-1.02)          0.30
Grade (3 vs. 2 or 1)            1.10 (0.84-1.44)          0.50          1.24 (0.88-1.75)          0.22
Tumour size, cm (> 2 vs. ≤ 2)   1.43 (1.09-1.86)          0.0096        1.59 (1.14-2.23)          0.0070
LVI (positive vs. negative)     1.49 (1.04-2.13)          0.031         1.72 (1.11-2.66)          0.015
Breast cancer subtypes
  Luminal B vs. Luminal A       1.43 (1.08-1.90)          0.013         1.84 (1.28-2.63)          0.00092
  Luminal/HER2+ vs. Luminal A   1.57 (0.97-2.54)          0.066         2.08 (1.15-3.76)          0.016

Table 5.3 Cox Model of Breast Cancer Subtypes, Adjuvant Tamoxifen. Cox regression analyses to estimate the adjusted hazard ratios of breast cancer subtypes among 828 hormone receptor positive patients with complete data for covariates, and who received tamoxifen as their sole adjuvant systemic therapy. RFS: Relapse-free survival. BCSS: Breast cancer specific survival. LVI: lymphovascular invasion.

                                RFS (N = 828)                           BCSS (N = 826)
Characteristics                 Hazard Ratio (95% C.I.)   P-value       Hazard Ratio (95% C.I.)   P-value
Age at diagnosis, yr            0.99 (0.98-1.00)          0.090         1.00 (0.98-1.02)          0.95
Grade (3 vs. 2 or 1)            1.33 (1.06-1.68)          0.016         1.35 (1.04-1.75)          0.023
Tumour size, cm (> 2 vs. ≤ 2)   1.56 (1.23-1.97)          2.0 × 10⁻⁴    1.64 (1.26-2.13)          2.5 × 10⁻⁴
LVI (positive vs. negative)     1.18 (0.91-1.51)          0.21          1.02 (0.78-1.34)          0.88
Proportion of positive axillary lymph nodes over total examined nodes
  0-25% vs. 0%                  1.91 (1.40-2.61)          5.2 × 10⁻⁵    1.69 (1.19-2.40)          0.0037
  > 25% vs. 0%                  3.24 (2.38-4.42)          1.0 × 10⁻¹³   3.26 (2.32-4.57)          8.7 × 10⁻¹²
Breast cancer subtypes
  Luminal B vs. Luminal A       1.59 (1.25-2.03)          1.7 × 10⁻⁴    1.60 (1.22-2.10)          7.6 × 10⁻⁴
  Luminal/HER2+ vs. Luminal A   1.56 (1.09-2.25)          0.016         1.77 (1.20-2.62)          0.0039

Table 5.4 Cox Model of Breast Cancer Subtypes, Adjuvant Tamoxifen and Chemotherapy. Cox regression analyses to estimate the adjusted hazard ratios of breast cancer subtypes among 167 hormone receptor positive tumors with complete data for all the covariates, from patients who received tamoxifen and chemotherapy (AC, FAC or CMF) as their adjuvant systemic therapy. RFS: Relapse-free survival. BCSS: Breast cancer specific survival. LVI: lymphovascular invasion.

                                RFS (N = 167)                           BCSS (N = 167)
Characteristics                 Hazard Ratio (95% C.I.)   P-value       Hazard Ratio (95% C.I.)   P-value
Age at diagnosis, yr            0.97 (0.94-1.00)          0.046         0.98 (0.94-1.01)          0.13
Grade (3 vs. 2 or 1)            1.19 (0.70-2.03)          0.52          0.94 (0.54-1.62)          0.82
Tumour size, cm (> 2 vs. ≤ 2)   1.00 (0.57-1.76)          1.00          1.54 (0.82-2.90)          0.18
LVI (positive vs. negative)     1.05 (0.59-1.87)          0.86          1.00 (0.54-1.84)          0.99
Proportion of positive axillary lymph nodes over total examined nodes
  0-25% vs. 0%                  1.54 (0.70-3.40)          0.29          2.11 (0.83-5.34)          0.12
  > 25% vs. 0%                  2.04 (0.96-4.30)          0.062         3.11 (1.31-7.39)          0.010
Breast cancer subtypes
  Luminal B vs. Luminal A       2.03 (1.15-3.58)          0.015         1.92 (1.05-3.52)          0.034
  Luminal/HER2+ vs. Luminal A   2.65 (1.23-5.71)          0.013         3.73 (1.70-8.16)          0.001

Supplemental Table 5.1 Clinicopathological Characteristics of qRT-PCR Breast Tumors. Clinicopathological characteristics of 357 invasive breast cancers, collected from Vancouver General Hospital and Washington University at St Louis, used for gene expression profiles.

Characteristics             Frequency   %
Nodal status
  Positive                  137         38.4
  Negative                  183         51.3
  Unknown                   37          10.4
Tumor size, cm
  ≤ 2                       139         38.9
  2-5                       170         47.6
  > 5                       30          8.4
  Unknown                   18          5.0
Grade
  1                         45          12.6
  2                         171         47.9
  3                         133         37.3
  Unknown                   8           2.2
Estrogen receptor
  Positive                  221         61.9
  Negative                  109         30.5
  Uninterpretable           27          7.6
Progesterone receptor
  Positive                  137         39.8
  Negative                  142         38.4
  Uninterpretable           78          21.8
HER2
  Positive                  35          9.8
  Negative                  277         77.6
  Uninterpretable           45          12.6

Supplemental Table 5.2 Patients' Characteristics of BCCA Series. Clinicopathological characteristics of 4046 invasive breast cancers included in the tissue microarrays from the British Columbia Cancer Agency (Cheang et al. Clin Cancer Res. 2008; 14(5):1368-76).

Supplemental Table 5.3 Cox Model of Breast Cancer Subtypes, Node Negative, Adjuvant Tamoxifen. Cox regression analyses to estimate the adjusted hazard ratios of breast cancer subtypes among 267 node negative, hormone receptor positive tumors with complete data for covariates, from patients who received tamoxifen as their sole adjuvant systemic therapy. RFS: Relapse-free survival. BCSS: Breast cancer specific survival. LVI: lymphovascular invasion.

                                RFS (N = 267)                           BCSS (N = 267)
Characteristics                 Hazard Ratio (95% C.I.)   P-value       Hazard Ratio (95% C.I.)   P-value
Age at diagnosis, yr            1.02 (0.99-1.05)          0.28          1.03 (1.00-1.07)          0.060
Grade (3 vs. 2 or 1)            1.63 (0.96-2.76)          0.070         1.25 (0.70-2.22)          0.45
Tumour size, cm (> 2 vs. ≤ 2)   1.42 (0.83-2.41)          0.20          1.54 (0.85-2.78)          0.16
LVI (positive vs. negative)     0.87 (0.51-1.46)          0.59          0.90 (0.51-1.61)          0.73
Breast cancer subtypes
  Luminal B vs. Luminal A       2.14 (1.24-3.67)          0.0061        2.22 (1.22-4.04)          0.0094
  Luminal/HER2+ vs. Luminal A   1.07 (0.36-3.16)          0.90          1.04 (0.30-3.60)          0.95

Supplemental Table 5.4 Cox Model of Breast Cancer Subtypes, Node Positive, Adjuvant Tamoxifen. Cox regression analyses to estimate the adjusted hazard ratios of breast cancer subtypes among 561 node positive, hormone receptor positive tumors with complete data for covariates, from patients who received tamoxifen as their sole adjuvant systemic therapy. RFS: Relapse-free survival. BCSS: Breast cancer specific survival. LVI: lymphovascular invasion.

                                RFS (N = 561)                           BCSS (N = 559)
Characteristics                 Hazard Ratio (95% C.I.)   P-value       Hazard Ratio (95% C.I.)   P-value
Age at diagnosis, yr            0.99 (0.97-1.00)          0.062         1.00 (0.98-1.01)          0.71
Grade (3 vs. 2 or 1)            1.32 (1.02-1.70)          0.037         1.46 (1.09-1.95)          0.011
Tumour size, cm (> 2 vs. ≤ 2)   1.67 (1.28-2.16)          0.00013       1.75 (1.30-2.34)          0.00021
LVI (positive vs. negative)     1.33 (0.99-1.79)          0.057         1.09 (0.80-1.51)          0.56
Breast cancer subtypes
  Luminal B vs. Luminal A       1.50 (1.14-1.97)          0.0042        1.49 (1.09-2.03)          0.013
  Luminal/HER2+ vs. Luminal A   1.78 (1.21-2.62)          0.0035        2.03 (1.34-3.07)          0.00081

Supplemental Table 5.5 Clinicopathological Characteristics Among “High-risk” Hormone Receptor Positive Tumors. Clinicopathological characteristics of 196 hormone receptor positive tumours from patients who received both tamoxifen and chemotherapy as their adjuvant systemic therapy. AC, doxorubicin and cyclophosphamide; FAC, fluorouracil, doxorubicin, and cyclophosphamide; CMF, cyclophosphamide, methotrexate, and fluorouracil.

Characteristics         Tamoxifen + (AC or FAC)   Tamoxifen + CMF   Total       P-value
                        (N = 124)                 (N = 72)          (N = 196)
                        N (%)                     N (%)             N
Age, yrs                                                                        0.094
  ≤ 40                  12 (9.7)                  13 (18.1)         25
  40-49                 50 (40.3)                 18 (25.0)         68
  50-65                 58 (46.8)                 37 (51.4)         95
  > 65                  4 (3.2)                   4 (5.6)           8
Tumour size, cm                                                                 0.053
  ≤ 2                   44 (36.7)                 16 (22.5)         60
  > 2                   76 (63.3)                 55 (77.5)         131
Grade                                                                           0.88
  1 or 2                60 (50.0)                 35 (51.5)         95
  3                     60 (50.0)                 33 (48.5)         93
Lymphovascular invasion                                                         0.42
  Negative              36 (30.8)                 25 (37.3)         61
  Positive              81 (69.2)                 42 (62.7)         123
Proportion of positive axillary lymph nodes / total examined lymph nodes        0.067
  0%                    25 (20.7)                 6 (9.2)           31
  0-25%                 40 (33.1)                 19 (29.2)         59
  > 25%                 56 (46.3)                 40 (61.5)         96
Breast cancer subtypes                                                          0.332
  Luminal A             52 (41.9)                 35 (48.6)         87
  Luminal B             53 (42.7)                 31 (43.1)         84
  Luminal/HER2+         19 (15.3)                 6 (8.3)           25

Figure 5.1. ROC Curve of Ki-67 Index. Receiver operating characteristic (ROC) curve analysis of 127 Luminal A and B tumors with Ki-67 immunohistochemical data, to identify Luminal B tumors unambiguously assigned by the 50-gene qRT-PCR expression profile classifier. These tumors had more than 0.100 difference between their corresponding Spearman rank correlation coefficients to the centroids of Luminal A and B. The selected best cutoff value for Ki-67 was 13.25%. IHC = immunohistochemistry.

Figure 5.2. Survival Analysis of Breast Cancer Subtypes Among Node Negative, Hormone Receptor Positive Tumors, No AST.
Univariable relapse-free survival (RFS, panel a) and breast cancer specific survival (BCSS, panel b) analyses of breast cancer subtypes among 943 breast cancer patients with node negative, hormone receptor positive disease. These patients did not receive any adjuvant systemic therapy.

Figure 5.3. Survival Analysis of Breast Cancer Subtypes Among Hormone Receptor Positive Tumors, Adjuvant Tamoxifen. Univariable relapse-free survival (RFS, panel a) and breast cancer specific survival (BCSS, panel b) analyses of breast cancer subtypes among 976 breast cancer patients with hormone receptor positive disease. These patients were treated with tamoxifen as their sole adjuvant systemic therapy.

Figure 5.4 Survival Analysis of Breast Cancer Subtypes Among Hormone Receptor Positive Tumors, Adjuvant Tamoxifen and Chemotherapy. Univariable relapse-free survival (RFS, panel a) and breast cancer specific survival (BCSS, panel b) analyses of breast cancer subtypes among 196 breast cancer patients with hormone receptor positive disease. These patients were treated with both tamoxifen and chemotherapy (AC, FAC or CMF) as adjuvant systemic treatments.

Supplemental Figure 5.1 ROC Curve of Ki-67 Index. Receiver operating characteristic (ROC) curve analysis of all 144 Luminal A and B tumors with Ki-67 immunohistochemical data, to identify Luminal B tumors defined by the 50-gene classifier using qRT-PCR gene expression profiles. The selected best cutoff value for Ki-67 was 13.25%. IHC = immunohistochemistry.

Supplemental Figure 5.2. Survival Analysis of Breast Cancer Subtypes Among Patients with Node Negative, Hormone Receptor Positive Tumors, Adjuvant Tamoxifen. Univariable relapse-free survival (RFS, panel a) and breast cancer specific survival (BCSS, panel b) analyses of breast cancer subtypes among 287 breast cancer patients with node negative, hormone receptor positive disease.
These patients received tamoxifen as their sole adjuvant systemic therapy.

Supplemental Figure 5.3. Survival Analysis of Breast Cancer Subtypes Among Patients with Node Positive, Hormone Receptor Positive Tumors, Adjuvant Tamoxifen. Univariable relapse-free survival (RFS, panel a) and breast cancer specific survival (BCSS, panel b) analyses of breast cancer subtypes among 627 breast cancer patients with node positive, hormone receptor positive disease. These patients received tamoxifen as their sole adjuvant systemic therapy.

References:

1. Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, et al. Molecular portraits of human breast tumours. Nature 2000;406(6797):747-52.
2. Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A 2001;98(19):10869-74.
3. Sorlie T, Tibshirani R, Parker J, Hastie T, Marron JS, Nobel A, et al. Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci U S A 2003;100(14):8418-23.
4. Slamon DJ, Leyland-Jones B, Shak S, Fuchs H, Paton V, Bajamonde A, et al. Use of chemotherapy plus a monoclonal antibody against HER2 for metastatic breast cancer that overexpresses HER2. N Engl J Med 2001;344(11):783-92.
5. Early Breast Cancer Trialists' Collaborative Group. Tamoxifen for early breast cancer: an overview of the randomised trials. Lancet 1998;351(9114):1451-67.
6. Howell A, Cuzick J, Baum M, Buzdar A, Dowsett M, Forbes JF, et al. Results of the ATAC (Arimidex, Tamoxifen, Alone or in Combination) trial after completion of 5 years' adjuvant treatment for breast cancer. Lancet 2005;365(9453):60-2.
7. Mauri D, Pavlidis N, Polyzos NP, Ioannidis JP. Survival with aromatase inhibitors and inactivators versus standard hormonal therapy in advanced breast cancer: meta-analysis. J Natl Cancer Inst 2006;98(18):1285-91.
8. Berry DA, Cronin KA, Plevritis SK, Fryback DG, Clarke L, Zelen M, et al. Effect of screening and adjuvant therapy on mortality from breast cancer. N Engl J Med 2005;353(17):1784-92.
9. Domagala W, Markiewski M, Harezga B, Dukowicz A, Osborn M. Prognostic significance of tumor cell proliferation rate as determined by the MIB-1 antibody in breast carcinoma: its relationship with vimentin and p53 protein. Clin Cancer Res 1996;2(1):147-54.
10. Trihia H, Murray S, Price K, Gelber RD, Golouh R, Goldhirsch A, et al. Ki-67 expression in breast carcinoma: its association with grading systems, clinical parameters, and other prognostic factors--a surrogate marker? Cancer 2003;97(5):1321-31.
11. de Azambuja E, Cardoso F, de Castro G, Jr., Colozza M, Mano MS, Durbecq V, et al. Ki-67 as prognostic marker in early breast cancer: a meta-analysis of published studies involving 12,155 patients. Br J Cancer 2007;96(10):1504-13.
12. Ellis MJ, Coop A, Singh B, Tao Y, Llombart-Cussac A, Janicke F, et al. Letrozole inhibits tumor proliferation more effectively than tamoxifen independent of HER1/2 expression status. Cancer Res 2003;63(19):6523-31.
13. Dowsett M, Smith IE, Ebbs SR, Dixon JM, Skene A, A'Hern R, et al. Prognostic value of Ki67 expression after short-term presurgical endocrine therapy for primary breast cancer. J Natl Cancer Inst 2007;99(2):167-70.
14. Hu Z, Fan C, Oh DS, Marron JS, He X, Qaqish BF, et al. The molecular portraits of breast tumors are conserved across microarray platforms. BMC Genomics 2006;7:96.
15. Oh DS, Troester MA, Usary J, Hu Z, He X, Fan C, et al. Estrogen-regulated genes predict survival in hormone receptor-positive breast cancers. J Clin Oncol 2006;24(11):1656-64.
16. Rouzier R, Perou CM, Symmans WF, Ibrahim N, Cristofanilli M, Anderson K, et al. Breast cancer molecular subtypes respond differently to preoperative chemotherapy. Clin Cancer Res 2005;11(16):5678-85.
17. Perou CM, Jeffrey SS, van de Rijn M, Rees CA, Eisen MB, Ross DT, et al. Distinctive gene expression patterns in human mammary epithelial cells and breast cancers. Proc Natl Acad Sci U S A 1999;96(16):9212-7.
18. Paik S, Shak S, Tang G, Kim C, Baker J, Cronin M, et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med 2004;351(27):2817-26.
19. Cheang MC, Voduc D, Bajdik C, Leung S, McKinney S, Chia SK, et al. Basal-like breast cancer defined by five biomarkers has superior prognostic value than triple-negative phenotype. Clin Cancer Res 2008;14(5):1368-76.
20. Carey LA, Perou CM, Livasy CA, Dressler LG, Cowan D, Conway K, et al. Race, breast cancer subtypes, and survival in the Carolina Breast Cancer Study. JAMA 2006;295(21):2492-502.
21. Parker J, Mullins M, Cheang MCU, Davies S, Mardis E, Nielsen TO, et al. A supervised risk predictor of breast cancer based on biological subtypes. In: J Clin Oncol; 2008. May 20 Suppl; abstr 11008.
22. Cheang MC, Treaba DO, Speers CH, Olivotto IA, Bajdik CD, Chia SK, et al. Immunohistochemical detection using the new rabbit monoclonal antibody SP1 of estrogen receptor in breast cancer is superior to mouse monoclonal antibody 1D5 in predicting survival. J Clin Oncol 2006;24(36):5637-44.
23. Voduc D, Cheang M, Nielsen T. GATA-3 Expression in Breast Cancer Has a Strong Association with Estrogen Receptor but Lacks Independent Prognostic Value. Cancer Epidemiol Biomarkers Prev 2008;17(2):365-73.
24. Olivotto A, Coldman AJ, Hislop TG, Trevisan CH, Kula J, Goel V, et al. Compliance with practice guidelines for node-negative breast cancer. J Clin Oncol 1997;15(1):216-22.
25. Nielsen TO, Hsu FD, Jensen K, Cheang M, Karaca G, Hu Z, et al. Immunohistochemical and clinical characterization of the basal-like subtype of invasive breast carcinoma. Clin Cancer Res 2004;10(16):5367-74.
26. Perreard L, Fan C, Quackenbush JF, Mullins M, Gauthier NP, Nelson E, et al. Classification and risk stratification of invasive breast carcinomas using a real-time quantitative RT-PCR assay. Breast Cancer Res 2006;8(2):R23.
27. Efron B, Tibshirani R. An Introduction to the Bootstrap. Boca Raton, FL: CRC Press LLC; 1998.
28. Bland JM, Altman DG. Survival probabilities (the Kaplan-Meier method). BMJ 1998;317(7172):1572.
29. Bland JM, Altman DG. The logrank test. BMJ 2004;328(7447):1073.
30. Cox DR, Oakes D. Analysis of Survival Data. London, England: Chapman & Hall; 1984.
31. Truong PT, Berthelet E, Lee J, Kader HA, Olivotto IA. The prognostic significance of the percentage of positive/dissected axillary lymph nodes in breast cancer recurrence and survival in patients with one to three positive axillary lymph nodes. Cancer 2005;103(10):2006-14.
32. Grambsch P, Therneau T. Proportional hazards tests and diagnostics based on weighted residuals. Biometrika 1994;81(3):515-26.
33. Fan C, Oh DS, Wessels L, Weigelt B, Nuyten DS, Nobel AB, et al. Concordance among gene-expression-based predictors for breast cancer. N Engl J Med 2006;355(6):560-9.
34. Marcom PK, Isaacs C, Harris L, Wong ZW, Kommarreddy A, Novielli N, et al. The combination of letrozole and trastuzumab as first or second-line biological therapy produces durable responses in a subset of HER2 positive and ER positive advanced breast cancers. Breast Cancer Res Treat 2007;102(1):43-9.
35. Stuart-Harris R, Caldas C, Pinder SE, Pharoah P. Proliferation markers and survival in early breast cancer: A systematic review and meta-analysis of 85 studies in 32,825 patients. Breast 2008.
36. Keshgegian AA, Cnaan A. Proliferation markers in breast carcinoma. Mitotic figure count, S-phase fraction, proliferating cell nuclear antigen, Ki-67 and MIB-1. Am J Clin Pathol 1995;104(1):42-9.
37. Bevilacqua P, Verderio P, Barbareschi M, Bonoldi E, Boracchi P, Dalla Palma P, et al. Lack of prognostic significance of the monoclonal antibody Ki-S1, a novel marker of proliferative activity, in node-negative breast carcinoma. Breast Cancer Res Treat 1996;37(2):123-33.
38. Clahsen PC, van de Velde CJ, Duval C, Pallud C, Mandard AM, Delobelle-Deroide A, et al. The utility of mitotic index, oestrogen receptor and Ki-67 measurements in the creation of novel prognostic indices for node-negative breast cancer. Eur J Surg Oncol 1999;25(4):356-63.
39. Joensuu H, Isola J, Lundin M, Salminen T, Holli K, Kataja V, et al. Amplification of erbB2 and erbB2 expression are superior to estrogen receptor status as risk factors for distant recurrence in pT1N0M0 breast cancer: a nationwide population-based study. Clin Cancer Res 2003;9(3):923-30.
40. Goodson WH, 3rd, Moore DH, 2nd, Ljung BM, Chew K, Mayall B, Smith HS, et al. The prognostic value of proliferation indices: a study with in vivo bromodeoxyuridine and Ki-67. Breast Cancer Res Treat 2000;59(2):113-23.
41. Liu S, Edgerton SM, Moore DH, 2nd, Thor AD. Measures of cell turnover (proliferation and apoptosis) and their association with survival in breast cancer. Clin Cancer Res 2001;7(6):1716-23.
42. Viale G, Regan MM, Mastropasqua MG, Maffini F, Maiorano E, Colleoni M, et al. Predictive value of tumor Ki-67 expression in two randomized trials of adjuvant chemoendocrine therapy for node-negative breast cancer. J Natl Cancer Inst 2008;100(3):207-12.
43. Cheang MC, van de Rijn M, Nielsen TO. Gene expression profiling of breast cancer. Annu Rev Pathol 2008;3:67-97.
44. Olivotto IA, Bajdik CD, Ravdin PM, Speers CH, Coldman AJ, Norris BD, et al. Population-based validation of the prognostic model ADJUVANT! for early breast cancer. J Clin Oncol 2005;23(12):2716-25.

CHAPTER SIX

BASAL-LIKE BREAST CANCER DEFINED BY FIVE BIOMARKERS HAS SUPERIOR PROGNOSTIC VALUE THAN TRIPLE-NEGATIVE PHENOTYPE [5]

Basal-like breast cancer is associated with high grade, poor prognosis and younger patient age.
Clinically, a triple negative phenotype definition (ER, PR and HER2 all negative) is commonly used to identify such cases. Epidermal growth factor receptor and cytokeratin 5/6 are readily available positive markers of basal-like breast cancer applicable to standard pathology specimens. This study directly compared the prognostic significance of three- and five-biomarker surrogate panels for defining intrinsic breast cancer subtypes, using a large clinically annotated series of over 4000 breast tumors.

Reprinted with permission of Clinical Cancer Research, American Association for Cancer Research.

⁵ A version of this chapter has been published. Cheang MC, Voduc D, Bajdik C, Leung S, McKinney S, Chia SK, Perou CM, Nielsen TO. Basal-like breast cancer defined by five biomarkers has superior prognostic value than triple-negative phenotype. Clinical Cancer Research 2008;14(5):1368-76.

INTRODUCTION

Breast cancer is a heterogeneous disease, and gene expression profiling has shown it to be classifiable into five major biologically distinct intrinsic subtypes: Luminal A, Luminal B, human epidermal growth factor receptor-2 (HER2) over-expressing, basal-like, and normal-like(1-3). These molecular subtypes have prognostic and predictive value: the HER2 over-expressing and basal-like breast cancers have poor outcomes, and within the estrogen receptor (ER) positive subtypes the Luminal B cohort has a significantly worse prognosis than Luminal A. Follow-up studies have shown these subtypes to be conserved across diverse patient series and array platforms(4, 5), and have demonstrated that different gene expression-based predictors are likely tracking a similar, common set of biological subtypes, with significant agreement in predicting patient outcome(6).
Among the five intrinsic subtypes, basal-like breast cancers have drawn particular attention, because they express neither ER, progesterone receptor (PR) nor HER2, and therefore would not be expected to benefit from antiestrogen hormonal therapies or from trastuzumab(7). Cost and complexity issues have to date rendered gene expression profiling impractical as a routine hospital diagnostic tool. However, immunohistochemical surrogate panels have been proposed that can potentially identify basal-like breast cancer, including ER-PR-HER2-negative (the "triple negative phenotype")(8), and negative hormone receptors and HER2 but either epidermal growth factor receptor (EGFR) or cytokeratin 5/6 (CK5/6) positive (the "five marker method")(9, 10). The triple negative phenotype (TNP) is convenient because it applies standard biomarkers already routinely ordered during the clinical work-up of breast cancer biopsies; however, this approach has never been formally validated against gold-standard gene expression profiling, and it relies entirely on negative results to identify basal-like breast cancers, a strategy which may carry an elevated risk of false assignments for technical reasons, with consequent decreased specificity. On the other hand, including EGFR and CK5/6 as positive immunohistochemical markers has previously been shown to accurately identify basal-like tumors from gene microarray data with 100% specificity and 76% sensitivity(9).

Approximately 15% of breast cancers are basal-like and are associated with poor relapse-free and overall survival(9-11). A recent population-based study has shown that this subtype is more prevalent in pre-menopausal African American women(10), which may contribute to the poor outcomes seen among these patients. Hereditary BRCA-1 breast tumors also resemble sporadic basal-like tumors(3, 12).
Basal-like breast cancers are likely to be mitotically active, high grade invasive tumors and are associated with younger patient age(13, 14). A readily available prognostic immunohistochemical surrogate, easily applied on formalin-fixed paraffin-embedded biopsy tissues, would identify a cohort of breast cancer patients who may require more aggressive systemic therapy and who would be the most appropriate subjects for clinical trials specifically targeting the basal-like subtype.

This study aims to compare the prognostic value of two proposed surrogate immunohistochemical panels used to identify basal-like breast cancers: the triple negative phenotype and the five marker Core Basal definitions. Using a regional series of more than 4000 primary invasive breast cancers with fully annotated clinical data, this report investigates the association of these two immunohistochemical panels with patient outcome.

MATERIALS AND METHODS

Patients and Tissue Microarrays

The study cohort consists of 4,046 primary invasive breast tumors from female patients. All patients had been referred to the British Columbia Cancer Agency between 1986 and 1992 and have staging, pathology, treatment and follow-up information. The median follow-up time was 12.5 years. During the study era, approximately 75% of breast cancer patients in the province were referred; the non-referred were generally elderly or treated by mastectomy without indications for adjuvant systemic therapy(15). In British Columbia, most patients were treated according to provincial guidelines developed and disseminated by the British Columbia Cancer Agency(15). These guidelines were based on age, tumor size, lymphovascular invasion, nodal status, and ER levels determined by dextran charcoal ligand-binding assay(16). High risk was defined as node-positive, or if node-negative, presence of lymphovascular invasion, or tumor > 2 cm and ER negative (< 10 fmol/mg).
Patients considered clinically "low risk" at the time of diagnosis during the study era were not given any adjuvant systemic therapy. Table 6.1 summarizes the tumor characteristics and treatment regimens of the breast cancer patients in this retrospective study, most of which have been previously presented(17). The Vancouver Hospital ER laboratory retained single archival blocks from each patient. Slides from these blocks were stained with hematoxylin and eosin and reviewed by two pathologists to identify areas of invasive breast carcinoma. Tissue microarrays were constructed as previously described(17). A total of 17 tissue array blocks were built. This study was approved by the Clinical Research Ethics Board of the University of British Columbia and the British Columbia Cancer Agency.

Immunohistochemistry and Fluorescent In-Situ Hybridization (FISH)

Immunohistochemistry for ER, PR, HER2, EGFR and CK5/6 was performed on each set of 17 formalin-fixed paraffin-embedded tissue slides using the standard streptavidin-biotin complex method with DAB chromogen. ER (Clone SP1, LabVision, Fremont, CA) antibody was used at 1:250 dilution with an 8-min microwave antigen retrieval in a 10 mM (pH 6.0) citrate buffer. Ready-to-use PR (Clone 1E2, Ventana, Tucson, AZ) antibody was used following the Ventana automated stainer standard CC1 protocol. EGFR staining was performed using the EGFR pharmDx kit (DAKO, Glostrup, Denmark) with an enzymatic antigen retrieval by proteinase K for 5 minutes. CK5/6 (Clone D5/16B4, Zymed Laboratories, South San Francisco, CA) antibody was used at 1:100 dilution following the Ventana automated stainer mild CC1 protocol. HER2 staining was performed using HER2 (Clone SP3, LabVision, Fremont, CA) antibody at 1:100 dilution with heat-induced antigen retrieval using 0.05 M TRIS buffer (pH 10.0) heated to 95°C in a steamer for 30 minutes.
For the HER2 Fluorescent In-Situ Hybridization (FISH) assay, slides were hybridized with probes to LSI Her-2/neu and CEP 17 with the PathVysion HER-2 DNA Probe Kit (Abbott Molecular Inc., Des Plaines, IL) according to the manufacturer's instructions, with modifications to the pre-treatment and hybridization as previously described(18). Slides were counterstained with DAPI and visualized on a Zeiss Axioplan epifluorescent microscope. Automated analysis of FISH signals was performed using a Metafer automated image acquisition and analysis system (Metasystems, Altlussheim, Germany). The average copy number for each probe was determined, and the amplification ratio was calculated as the ratio of the average HER2 copy number per cell to the average centromere 17 copy number per cell.

Biomarker expression from immunohistochemical assays was scored by two pathologists, blinded to the clinicopathological characteristics and outcome of the breast tumors, using previously established and published criteria developed on other breast cancer cohorts. ER(17) and PR stains were considered positive if immunostaining was seen in more than 1% of tumor nuclei. EGFR and CK5/6 stains were considered positive if any (weak or strong) cytoplasmic and/or membranous invasive carcinoma cell staining was observed(9). For HER2 status, tumors were considered positive if scored as 3+ according to HercepTest criteria, and FISH with amplification ratio ≥ 2.0 was used to segregate immunohistochemically equivocal (2+) results(19). All the stained tissue microarrays have been digitally scanned and are available for public access (http://www.gpecimage.ubc.ca; username, basal4000; password, corebasal).

Definition of Breast Cancer Biological Subtypes by Immunohistochemistry

The immunohistochemical surrogate (ER, PR, HER2, EGFR and CK5/6) defining breast cancer subtypes has been previously published(9, 10).
In this study, we use two classification schemes: the triple negative phenotype and the five biomarker method. Basal-like breast cancer is defined differently by the two schemes. Using the triple negative phenotype method, basal-like is negative for all routinely tested biomarkers: ER, PR and HER2 (ER-PR-HER2-), and this surrogate definition of basal-like is referred to as TNP in this paper. Using the five biomarker method, TNP becomes divided into two groups: (a) triple negative cases (ER-PR-HER2-) which also positively express either EGFR or CK5/6, referred to as Core Basal in this paper, and (b) the five marker negative phenotype (5NP), which is triple negative and furthermore expresses neither EGFR nor CK5/6. Thus, the 5NP cases represent those cases considered basal-like by the TNP method but not by the Core Basal definition. Three other biological subtype definitions are common to both schemes: HER2+/ER-PR-, Luminal' (ER+ and/or PR+, and HER2-), and Luminal/HER2+ (ER+ and/or PR+, and HER2+) (Supplemental Table 6.1). Tumors expressing HER2 but negative for both ER and PR were defined as HER2+/ER-PR-. Tumors expressing HER2 and at least one of the Luminal markers (ER or PR) were defined as Luminal/HER2+. Luminal/HER2+ is not synonymous with the Luminal B expression profile subtype because only 30% to 50% of Luminal B tumors express HER2. Luminal' includes all cases that expression profiling defines as Luminal A, as well as those remaining Luminal B tumors that do not express HER2.

Biomarker information was considered uninterpretable in cases where the tissue core was lost during sectioning or processing, or contained fewer than 50 visible invasive breast carcinoma cells. Tumors missing any of ER, PR or HER2 data are categorized as unassigned. The two classification schemes are described in Supplemental Table 6.1.

Statistical Analysis

All statistical analyses were carried out using SPSS 14.0 (SPSS Inc, Chicago, IL) and R 2.4.0 (www.r-project.org).
Differences between breast cancer subtypes with regard to clinicopathologic characteristics were examined using χ² tests. For survival analysis, Breast Cancer Specific Survival (BCSS) was of primary interest. Survival time was calculated from the date of diagnosis of breast cancer to the date of death. Survival times were censored if the primary or underlying cause of death was not breast cancer, or if the patient was still alive on June 30, 2004 (the date when the outcome data were collected). Univariate survival curves were generated by the Kaplan-Meier method(20) and differences in survival among the breast cancer subtypes were assessed by the log-rank test(21). For multivariate analysis, we built Cox regression models(22) to estimate the adjusted hazard ratios of breast cancer subtypes with standard clinicopathological variables: age at diagnosis, histologic grade, tumor size, lymphovascular invasion and number of positive axillary lymph nodes as a percentage of the total number examined(23). Only cases with information for all the covariates were included in the analysis. Smoothed plots of weighted Schoenfeld residuals were used to test proportional hazard assumptions(24). Separate Cox regression models were also built for the subsets of patients (a) receiving no adjuvant systemic therapy, to compare the prognostic values of the two basal-like subtype definitions for studying the natural history of breast cancer, and (b) receiving adjuvant chemotherapy, to estimate the additional prognostic value of EGFR and CK5/6 for defining the basal-like subtype in this setting.

To test the statistical significance of the additional biomarkers (EGFR and CK5/6) for defining the basal-like subtype, a likelihood ratio test(25, 26) of the differences between the nested Cox regression models was used. The null hypothesis was that the five-biomarker model did not describe BCSS differently than the three-biomarker model.
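The likelihood ratio test described above can be illustrated with a minimal sketch. It assumes that the maximized (partial) log-likelihoods of the nested three-biomarker model and the full five-biomarker model are already available from the fitted Cox models, and that the full model adds exactly one parameter (TNP split into Core Basal and 5NP), so the statistic is referred to a chi-square distribution with 1 degree of freedom. The function name and the log-likelihood values are hypothetical and are not taken from the thesis.

```python
import math


def likelihood_ratio_test_df1(loglik_nested: float, loglik_full: float):
    """Likelihood ratio test for two nested models differing by one parameter.

    Returns (statistic, p_value), where statistic = 2 * (ll_full - ll_nested)
    and the p-value is the upper tail of a chi-square with 1 degree of freedom.
    """
    # A full model can never fit worse than a model nested within it; clamp
    # tiny negative values caused by numerical noise to zero.
    statistic = max(0.0, 2.0 * (loglik_full - loglik_nested))
    # For 1 degree of freedom: P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical log-likelihoods, for illustration only.
    stat, p = likelihood_ratio_test_df1(loglik_nested=-1000.0, loglik_full=-998.0)
    print(f"LR statistic = {stat:.2f}, p = {p:.4f}")
```

A small p-value indicates that splitting TNP into Core Basal and 5NP improves the fit more than chance alone would explain. For models differing by more than one parameter, a general chi-square tail function would be needed instead of the `erfc` shortcut.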
Bootstrap resampling analyses(27) were carried out (10,000 iterations) to assess the adequacy of the Cox model hazard ratio confidence intervals. In this study, bootstrapping involved randomly sampling the data with replacement and repeating the Cox regression analyses to assess the hazard ratios. We found the bootstrap confidence intervals were in close agreement with the model-based estimates, yielding no evidence that the model was overfitted to the data.

The purpose of this study was to validate findings from other studies, and to test a relatively small number of pre-specified hypotheses; accordingly, we did not perform multiple comparisons corrections. All tests were 2-sided and p-values less than 0.05 were considered statistically significant.

The data were assembled to provide more than 80% power for testing hypotheses regarding the biomarkers in all patients combined, as well as for patient subgroups defined by the adjuvant therapies they received.

RESULTS

In this series of 4,046 tumors, the percentage of positive expression among interpretable cases is 69.5% (2791/4015) for ER, 13.0% (504/3865) for HER2, 51.2% (1846/3605) for PR, 13.3% (462/3478) for EGFR and 8.4% (287/3400) for CK5/6. A total of 3,744 tumors with enough information for unambiguous immunohistochemical surrogate classification were assigned to breast cancer biological subtypes according to the TNP and the Core Basal classification schemes respectively. There were no statistically significant survival differences between the 302 unassigned tumors and the 3,744 classifiable tumors (Log-rank P-value: 0.179). Among the cases with complete data for assignment by both schemes, 17% (639/3744) were defined as basal-like breast cancers by the triple negative definition, whereas 9.0% (336/3744) were basal-like by the Core Basal definition (with the other 303 classified as 5NP).
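The assignment rules behind these counts (Supplemental Table 6.1) can be sketched as a small function. This is an illustrative reading of the published rules rather than the code used in the study. It follows the table footnote in treating a case as unassigned when HER2 is missing, or when neither ER nor PR is positive and one of them is missing. It further assumes, as a labeled guess, that a triple negative case without enough EGFR and CK5/6 data is left unassigned under the Core Basal scheme. The function name and the True/False/None encoding are hypothetical.

```python
from typing import Optional, Tuple

UNASSIGNED = "Unassigned"


def classify(er: Optional[bool], pr: Optional[bool], her2: Optional[bool],
             egfr: Optional[bool] = None,
             ck56: Optional[bool] = None) -> Tuple[str, str]:
    """Return (TNP-scheme subtype, Core Basal-scheme subtype).

    True = positive, False = negative, None = missing or uninterpretable.
    """
    if her2 is None:
        return UNASSIGNED, UNASSIGNED
    hormone_positive = er is True or pr is True
    if not hormone_positive and (er is None or pr is None):
        # A missing ER or PR result cannot rule out a luminal tumor.
        return UNASSIGNED, UNASSIGNED
    if hormone_positive:
        label = "Luminal/HER2+" if her2 else "Luminal'"
        return label, label
    if her2:
        return "HER2+/ER-PR-", "HER2+/ER-PR-"
    # ER, PR and HER2 are all negative from here on (triple negative).
    if egfr is True or ck56 is True:
        return "TNP", "Core Basal"
    if egfr is False and ck56 is False:
        return "TNP", "5NP"
    # Assumption: too little EGFR/CK5/6 data to split the triple negative case.
    return "TNP", UNASSIGNED


if __name__ == "__main__":
    print(classify(er=False, pr=False, her2=False, egfr=True, ck56=False))
    print(classify(er=False, pr=False, her2=False, egfr=False, ck56=False))
    print(classify(er=True, pr=None, her2=False))
```

Under these rules every TNP case with interpretable EGFR and CK5/6 results falls into exactly one of Core Basal or 5NP, which is consistent with the 639 TNP cases splitting into 336 and 303.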
Clinicopathologic Characteristics of Breast Cancer Subtypes

The tumor characteristics of each breast cancer subtype are summarized in Table 6.2. With either classification scheme, the major breast cancer subtypes differ significantly by age, grade, tumor size, lymphovascular invasion and percentage of positive over total dissected axillary lymph nodes. For both the TNP and the Core Basal classification, basal-like breast cancer is associated with younger patient age, lower rates of lymphovascular invasion and the lowest percentage of positive axillary lymph node involvement. However, the two schemes do differ, with an even larger fraction of Core Basal cases being high grade (87% Grade 3) and <40 years old (18.8%), versus 64.4% of 5NP cases that were Grade 3 and 10.2% that were less than 40 years old. This suggests that the Core Basal classification is identifying a subset of particularly high-risk patients.

Breast Cancer Specific Survival by Immunohistochemical Subtype

The breast cancer subtypes as defined by the surrogate immunopanels differ significantly in predicting breast cancer specific survival (Figure 6.1). The HER2+/ER-PR- subtype has the worst breast cancer survival among the four subtypes. Comparing the survival probabilities between basal-like definitions, the 10-yr breast cancer specific survival for TNP is 67% (95% C.I. 63-70) (Figure 6.1a), but when the basal-like subtype is segregated into Core Basal and 5NP by adding EGFR and CK5/6 immunostaining information, the Core Basal group has significantly worse outcome, with an absolute 10% lower 10-yr BCSS than 5NP (Figure 6.1b).

The breast cancer subtypes maintain significant independent prognostic value to predict breast cancer death in the Cox model including the established prognostic factors (Table 6.3). Compared to Luminal', which represents the most common subtype of breast cancers, TNP has a hazard ratio of 1.39 (95% C.I. 1.17-1.66) for breast cancer-specific death.
On the other hand, the Core Basal group has 1.62 times greater risk for breast cancer-specific death (95% C.I. 1.31-2.00) whereas the 5NP group does not have statistically or clinically significant prognostic value (HR 1.16, 95% C.I. 0.91-1.49; Table 6.3). The likelihood ratio test between the two Cox models is significant (P-value: 0.0273, Table 6.3).

Since the hazard ratio between Luminal' and 5NP is not proportional across time, Cox regression analysis may over- or underestimate this significance. We therefore ran multivariate analyses restricted to the TNP tumors to test the significance of the Core Basal association with outcome in this subset. Relative to 5NP, Core Basal has an estimated hazard ratio of 1.47 (95% C.I. 1.08-1.99) for breast cancer death (Table 6.4a). The likelihood ratio test is also significant. Therefore, in this cohort, relying on the three biomarker classifier (ER, PR, HER2) to define basal-like tumors loses significant information to predict breast cancer outcome, compared with the five marker panel incorporating EGFR and CK5/6.

Prognostic Values of Breast Cancer Subtypes Within Treatment Subsets

Among the Core Basal patients, 179 received no adjuvant systemic therapy, 48 were treated with anthracycline-based chemotherapy (32 with AC and 16 with FAC), and 55 were treated with non-anthracycline-based chemotherapy (CMF). Among the 5NP patients, 141 received no adjuvant systemic therapy, 58 anthracycline-based chemotherapy (36 AC and 22 FAC) and 31 non-anthracycline-based chemotherapy. In the no adjuvant systemic therapy subset, although Core Basal and 5NP were ER negative, predominantly grade 3 tumors, most had neither nodal involvement nor lymphovascular invasion and half were less than 2 cm (Supplemental Table 6.2).

The Core Basal patients who received no adjuvant systemic therapy (predominantly considered clinically low risk at the time) have 9% lower 10-yr BCSS than similarly treated 5NP patients (Supplemental Figure 6.1a).
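The 10-yr BCSS values compared in this section are Kaplan-Meier estimates. As a rough illustration of how such an estimate handles censoring, the following sketch implements the product-limit estimator in plain Python. The follow-up times and event flags are made up for the example and are not study data. An event flag of 1 stands for death from breast cancer, and 0 for censoring (death from another cause, or alive at the end of follow-up), mirroring the censoring rule given in the Statistical Analysis section.

```python
from typing import Sequence


def km_survival(times: Sequence[float], events: Sequence[int],
                horizon: float) -> float:
    """Kaplan-Meier (product-limit) estimate of survival at `horizon`.

    times:  follow-up time for each patient (e.g. years since diagnosis)
    events: 1 if the patient died of breast cancer at that time, 0 if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    survival = 1.0
    # Distinct times, up to the horizon, at which at least one event occurred.
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if e and u == t)
        survival *= 1.0 - deaths / at_risk
    return survival


if __name__ == "__main__":
    # Hypothetical follow-up data for six patients, for illustration only.
    times = [2, 3, 3, 5, 8, 12]
    events = [1, 1, 0, 1, 0, 1]
    print(f"10-yr survival estimate: {km_survival(times, events, 10):.3f}")
```

Censored patients contribute to the at-risk count up to their censoring time and are then dropped, so they reduce the denominator without counting as deaths. An absolute difference in 10-yr BCSS between two groups would be the difference between two such estimates, one per group.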
For patients treated with anthracycline-based adjuvant chemotherapy (Figure 6.2), the Core Basal patients have a significant 26% lower 10-yr BCSS than equivalently treated 5NP patients (Log-rank P-value: 1.64 × 10^-3). After adjusting for age, tumor size, grade, axillary node and lymphovascular invasion status, the Core Basal group has significantly worse survival, with a hazard ratio of 4.26 versus the 5NP cohort (95% C.I. 2.00-9.08, Table 6.4b). In this patient group, Core Basal status is the most significant prognosticator in the multivariable model, ahead of nodal involvement and tumor size. The likelihood ratio test is significant (P-value: 7.41 × 10^-5).

On the other hand, among patients receiving non-anthracycline-based chemotherapy (Supplemental Figure 6.1b), outcomes were relatively poor, with no statistically significant survival difference between Core Basal and 5NP. The tumor characteristics of the non-anthracycline (CMF) and anthracycline (AC or FAC) treated cohorts were similar (Supplemental Table 6.2); tumors treated with anthracyclines were mostly diagnosed between 1990 and 1992.

DISCUSSION

Using a large regional population-based cohort, this study compared two previously established immunohistochemical surrogate panels that define breast cancer subtypes. Prognostic implications of breast cancer molecular subtypes have been described in several reports(1, 3, 8-10); however, to date no studies have been this large (>3700 patients), nor have any assessed breast cancer specific survival, stratified by adjuvant treatment, using the triple negative versus five biomarker Core Basal methods. The molecular subtyping of breast cancer is validated and shown to be an independent prognostic factor.
We demonstrate that including positive markers (epidermal growth factor receptor and cytokeratins 5/6) for the basal subtype results in a significantly better identification of a high risk group, whose outcome more closely matches that expected by gene expression profiling(1-3, 5), than was achieved using a "triple negative" (ER-PR-HER2-) definition. Our results from the multivariable Cox regression analyses strongly suggest that among the triple negative cases, the poor prognosis is conferred almost entirely by the subset of tumors which are positive for epidermal growth factor receptor or basal cytokeratins.

Basal-like breast cancer has been suggested to be definable by negative ER, PR and HER2 immunostaining(8, 28-30), a "triple negative phenotype" which can often be extracted from existing clinical records, but this definition has never been validated with microarray data. Our results provide evidence that this definition can easily be improved upon through the use of other immunostains already commonly employed in surgical pathology laboratories. The prevalence of the triple negative phenotype (17%) in our study is consistent with a recent report which assigned 281 of 1726 cases (16.3%) as TNP(31). A concurrent report by the same group(32) found that, among TNP cases, a basal phenotype (defined using CK5/6 and CK14) was significantly prognostic within the node-negative subset, and further suggested that triple negative and basal definitions are associated with good response to chemotherapy (although treatments were not randomized and information on chemotherapy regimens was not given). The specific Core Basal definition used here and based on previous independent series was not presented, and therefore direct comparisons are difficult to make.
Another study, using 375 stage II breast tumors treated with tamoxifen but not adjuvant chemotherapy, defined 48 tumors as basal-like on the basis of cytokeratin 5 or 14 immunostaining, and reported no significant survival differences among ER negative tumors(33). The discrepancy is likely due to limited power and the different choice of surrogate biomarkers, as cytokeratin 14 has not been found by gene expression to be a marker for basal-like tumors. In our "pure prognostic" group of patients receiving no adjuvant systemic therapy, Core Basal (10-yr BCSS 70%) and HER2+/ER-PR- (10-yr BCSS 59%) subtypes are associated with significantly distinct breast cancer specific survival in univariable Kaplan-Meier analysis (Log-rank P-value: 0.0395), with an adjusted hazard ratio of 0.74 (95% C.I. 0.438-1.04). Our results support the view that basal-like phenotype breast tumors, having a different natural history than HER2+/ER-PR- tumors, display a clinically distinct outcome, as well as distinct clinical features such as high grade, node-negative progression and higher prevalence in young patients.

By adding EGFR and CK5/6 as positive markers, a significantly worse outcome group can be identified among triple negative cases. The Core Basal definition is associated with even poorer breast cancer survival in the whole population-based cohort, and also in the anthracycline-based chemotherapy cohort, a generally high-risk group treated with similar regimens in contemporary practice. Poor outcome despite anthracycline treatment is concordant with a recent case-control study (47 basal cytokeratin-expressing breast cancers and 49 stage-matched but mainly ER+ controls)(34). Other studies(8, 14, 35) have reported that the basal-like subtype is a potential candidate to respond well to chemotherapy.
In a neoadjuvant study, basal-like tumors (defined by triple negative phenotype) had higher rates (27%) of pathologic complete response to anthracycline-based neoadjuvant chemotherapy than luminal breast cancers(36). However, those triple negative tumors that did not achieve complete response had the highest rate of relapse, potentially explaining the poor prognosis of basal-like tumors as a group despite aggressive chemotherapy. Our findings are compatible with a recent study analyzing 823 patients from two clinical trials randomized to receive anthracyclines versus no adjuvant chemotherapy(37). In that study, a "true-basal" group defined as HER2 negative, ER negative and either EGFR or CK5/6 positive exhibited less benefit from anthracyclines than the group negative for all four of these markers.

One limitation of our study is that our cohort derives from a regional population base. Adjuvant! is a computer software program that predicts breast cancer outcomes based on SEER data and clinical trial meta-analyses to guide treatment decisions in clinical practice(38). Almost half of our dataset was used in an earlier study confirming that, in the British Columbia population, Adjuvant! predictions are comparable to observed outcomes(16), supporting extrapolation of the conclusions drawn in the present study to North American populations.

In British Columbia, most patients were treated according to provincial guidelines developed and disseminated by the British Columbia Cancer Agency. Associations relying on non-randomized treatment regimens (such as the apparent poor outcomes of Core Basal relative to 5NP tumors in the adjuvant anthracycline subset) are best considered hypothesis-generating. Thus, our finding that the Core Basal definition may predict response to anthracycline-based adjuvant chemotherapies needs validation.
Prospective clinical trial designs are clearly needed to investigate the benefit of different chemotherapy regimens in basal-like breast cancer. Use of a triple negative definition is attractive in the design of such studies as it does not necessitate additional biomarker information. However, a major implication of our current study is that relying on a triple negative phenotype definition of basal-like breast cancer will still identify a heterogeneous group with significant differences in survival, potentially obscuring important findings.

In North American and European populations, approximately 12-20% of breast cancer patients have basal-like gene expression profiles and/or a triple negative immunophenotype(1, 3, 30, 31). Our results provide strong evidence to support the use of a five biomarker surrogate (ER, PR, HER2, EGFR and CK5/6) to define the basal-like subtype, a finding of immediate relevance to prognostication and clinical trial design. Drawing on readily available, inexpensive diagnostic tools already in clinical use, this immunopanel provides a more specific definition of this aggressive form of breast cancer for which there is a particular need to improve therapeutic options.

Table 6.1. Clinicopathological Characteristics of Whole Cohort. Summary of clinicopathological characteristics of 4,046 patients with invasive breast tumors included in this tissue microarray series. Abbreviations: AC, doxorubicin and cyclophosphamide; CMF, cyclophosphamide, methotrexate, and fluorouracil; FAC, fluorouracil, doxorubicin, and cyclophosphamide.

Characteristics | No. of Patients | %
Age at diagnosis, years: < 40 | 380 | 9.4
Age at diagnosis, years: 40-49 | 767 | 19
Age at diagnosis, years: 50-65 | 1435 | 35.5
Age at diagnosis, years: > 65 | 1464 | 36.2
Menstrual status at referral: Premenopausal | 1188 | 29.4
Menstrual status at referral: Postmenopausal | 2761 | 68.2
Menstrual status at referral: Pregnant | 2 | 0.1
Menstrual status at referral: Unknown | 95 | 2.3
Histology: Ductal | 3661 | 90.5
Histology: Lobular | 308 | 7.6
Histology: Other | 77 | 1.9
Grade: I (well differentiated) | 211 | 5.2
Grade: II (moderately well or partially differentiated) | 1582 | 39.1
Grade: III (poorly differentiated) | 2069 | 51.1
Grade: Unknown | 184 | 4.5
Tumor size, cm: ≤ 2 | 2093 | 51.7
Tumor size, cm: 2-5 | 1697 | 41.9
Tumor size, cm: > 5 | 219 | 5.4
Tumor size, cm: Unknown | 37 | 0.9
Lymphovascular invasion (LVI): Positive | 1750 | 43.3
Lymphovascular invasion (LVI): Negative | 2120 | 52.4
Lymphovascular invasion (LVI): Unknown | 176 | 4.3
Percentage of positive/total number of examined axillary lymph nodes: 0% | 2161 | 53.4
Percentage of positive/total number of examined axillary lymph nodes: 0-25% | 876 | 21.7
Percentage of positive/total number of examined axillary lymph nodes: > 25% | 825 | 20.4
Percentage of positive/total number of examined axillary lymph nodes: Unknown | 184 | 4.5
Local Therapy: No breast surgery | 60 | 1.5
Local Therapy: Mastectomy + radiation therapy | 631 | 15.6
Local Therapy: Mastectomy alone | 1557 | 38.5
Local Therapy: Lumpectomy alone | 135 | 3.3
Local Therapy: Lumpectomy + radiation therapy | 1663 | 41.1
Adjuvant Systemic Therapy (AST): None | 1689 | 41.7
Adjuvant Systemic Therapy (AST): Tamoxifen only | 1305 | 32.3
Adjuvant Systemic Therapy (AST): Ovarian ablation or hormone therapy other than tamoxifen | 7 | 0.2
Adjuvant Systemic Therapy (AST): Chemotherapy only, AC | 148 | 3.7
Adjuvant Systemic Therapy (AST): Chemotherapy AC + tamoxifen | 125 | 3.1
Adjuvant Systemic Therapy (AST): Chemotherapy only, CMF | 429 | 10.6
Adjuvant Systemic Therapy (AST): Chemotherapy CMF + tamoxifen | 39 | 1
Adjuvant Systemic Therapy (AST): Chemotherapy only, FAC | 92 | 2.3
Adjuvant Systemic Therapy (AST): Chemotherapy FAC + tamoxifen | 68 | 1.7
Adjuvant Systemic Therapy (AST): Chemotherapy only (other) | 70 | 1.7
Adjuvant Systemic Therapy (AST): Chemotherapy (other) + tamoxifen | 69 | 1.7
Adjuvant Systemic Therapy (AST): Ovarian ablation or hormone therapy + chemotherapy | 5 | 0.1

Table 6.2. Clinicopathological Characteristics Among Breast Cancer Subtypes. Clinicopathologic characteristics of the 4,046 breast cancer tumors; subtypes defined by the Triple Negative Phenotype and Core Basal method. Abbreviation: LVI, lymphovascular invasion.
All cells are N (% within subtype). Basal (TNP) is the basal-like group under the Triple Negative Phenotype (TNP) method; Basal (Core Basal) and 5NP are the groups under the Core Basal method.

Characteristic | Luminal' | Luminal/HER2+ | HER2+/ER-PR- | Basal (TNP) | Basal (Core Basal) | 5NP | Unassigned
Total N | 2625 | 222 | 258 | 639 | 336 | 303 | 302
Age, yr: < 40 | 184 (7.0) | 21 (9.5) | 37 (14.3) | 115 (18.0) | 71 (21.1) | 44 (14.5) | 23 (7.6)
Age, yr: 40-49 | 476 (18.1) | 49 (22.1) | 45 (17.4) | 149 (23.3) | 79 (23.5) | 70 (23.1) | 48 (15.9)
Age, yr: 50-65 | 921 (35.1) | 78 (35.1) | 112 (43.4) | 209 (32.7) | 109 (32.4) | 100 (33.0) | 115 (38.1)
Age, yr: > 65 | 1044 (39.8) | 74 (33.3) | 64 (24.8) | 166 (26.0) | 77 (22.9) | 89 (29.4) | 116 (38.4)
Grade: I | 174 (6.6) | 4 (1.8) | 2 (0.8) | 12 (1.9) | 1 (0.3) | 11 (3.6) | 19 (6.3)
Grade: II | 1244 (47.4) | 56 (25.2) | 53 (20.5) | 115 (18) | 35 (10.4) | 80 (26.4) | 114 (37.7)
Grade: III | 1082 (41.2) | 155 (69.8) | 196 (76) | 488 (76.4) | 293 (87.2) | 195 (64.4) | 148 (49)
Grade: Unknown | 125 (4.8) | 7 (3.2) | 7 (2.7) | 24 (3.8) | 7 (2.1) | 17 (5.6) | 21 (7)
Tumor size: ≤ 2 cm | 1455 (55.4) | 90 (40.5) | 105 (40.7) | 284 (44.4) | 152 (45.2) | 132 (43.6) | 159 (52.6)
Tumor size: 2-5 cm | 1041 (39.7) | 118 (53.2) | 127 (49.2) | 300 (46.9) | 155 (46.1) | 145 (47.9) | 111 (36.8)
Tumor size: > 5 cm | 106 (4.0) | 13 (5.9) | 20 (7.8) | 53 (8.3) | 28 (8.3) | 25 (8.3) | 27 (8.9)
Tumor size: Unknown | 23 (0.9) | 1 (0.5) | 6 (2.3) | 2 (0.3) | 1 (0.3) | 1 (0.3) | 5 (1.7)
LVI: Positive | 1100 (41.9) | 133 (59.9) | 137 (53.1) | 259 (40.5) | 135 (40.1) | 124 (40.9) | 121 (40.1)
LVI: Negative | 1406 (53.6) | 84 (37.8) | 113 (43.8) | 351 (54.9) | 185 (55.1) | 166 (54.8) | 166 (55)
LVI: Unknown | 119 (4.5) | 5 (2.3) | 8 (3.1) | 29 (4.5) | 16 (4.8) | 13 (4.3) | 15 (5)
Percentage of positive/total examined axillary lymph nodes: 0% | 1412 (54.8) | 90 (40.5) | 109 (42.2) | 378 (59.2) | 204 (60.7) | 174 (57.4) | 172 (57.0)
Percentage of positive/total examined axillary lymph nodes: 0-25% | 578 (22.0) | 46 (20.7) | 60 (23.3) | 127 (19.9) | 66 (19.6) | 61 (20.1) | 65 (21.5)
Percentage of positive/total examined axillary lymph nodes: > 25% | 511 (19.5) | 72 (32.4) | 78 (30.2) | 112 (17.5) | 55 (16.4) | 57 (18.8) | 52 (17.2)

Table 6.3. Cox Model of Breast Cancer Subtypes. Cox regression analysis to estimate the adjusted hazard ratios of breast cancer subtypes defined by the TNP and Core Basal methods respectively, on 3,558 cases with sufficient information for all of the variables. Hazard ratios above 1.0 indicate poorer outcome. The likelihood ratio test of the TNP (nested model) and Core Basal (full model) methods has a p-value of 0.0273. Hazard ratios for individual clinicopathological parameters in univariable breast cancer specific survival analysis are listed for reference.

Variables | Univariate analysis, HR (95% C.I.) | Multivariate analysis, Nested Cox Model, HR (95% C.I.) | Multivariate analysis, Full Cox Model, HR (95% C.I.)
Age, yrs: 40-49 vs. < 40 | 0.62 (0.51-0.76) | 0.73 (0.59-0.91) | 0.74 (0.59-0.92)
Age, yrs: 50-65 vs. < 40 | 0.65 (0.54-0.78) | 0.83 (0.68-1.01) | 0.84 (0.69-1.02)
Age, yrs: > 65 vs. < 40 | 0.65 (0.54-0.78) | 0.87 (0.71-1.07) | 0.89 (0.72-1.09)
Grade: III vs. (II and I) | 2.11 (1.86-2.39) | 1.49 (1.30-1.71) | 1.47 (1.28-1.69)
Lymphovascular invasion: Positive vs. negative | 2.28 (2.02-2.58) | 1.31 (1.13-1.53) | 1.32 (1.14-1.53)
Tumor size, cm: 2-5 vs. ≤ 2 | 2.08 (1.84-2.36) | 1.65 (1.44-1.88) | 1.65 (1.44-1.88)
Tumor size, cm: > 5 vs. ≤ 2 | 3.32 (2.69-4.09) | 1.77 (1.38-2.27) | 1.77 (1.39-2.27)
Percentage of positive/dissected axillary lymph nodes: 0-25 vs. 0 | 1.98 (1.70-2.31) | 1.62 (1.36-1.92) | 1.62 (1.37-1.93)
Percentage of positive/dissected axillary lymph nodes: > 25 vs. 0 | 3.99 (3.48-4.58) | 2.88 (2.44-3.40) | 2.89 (2.45-3.42)
Breast cancer subtype (Luminal' as reference): Luminal/HER2+ | 1.93 (1.55-2.40) | 1.41 (1.11-1.79) | 1.41 (1.12-1.79)
Breast cancer subtype (Luminal' as reference): HER2+/ER-PR- | 2.27 (1.86-2.76) | 1.88 (1.52-2.33) | 1.89 (1.53-2.34)
Breast cancer subtype (Luminal' as reference): Unassigned | 1.02 (0.81-1.29) | 1.12 (0.87-1.44) | 1.13 (0.99-1.45)
Breast cancer subtype (Luminal' as reference): TNP | 1.50 (1.28-1.74) | 1.39 (1.17-1.66) | not in model
Breast cancer subtype (Luminal' as reference): Core Basal | 1.77 (1.46-2.14) | not in model | 1.62 (1.31-2.00)
Breast cancer subtype (Luminal' as reference): 5NP | 1.22 (0.97-1.52) | not in model | 1.16 (0.91-1.49)

Table 6.4. Cox Model of Basal-like Tumors. (a) Cox regression analysis on 575 Basal (TNP) tumors with sufficient information for the clinicopathological covariates: age, tumor size, grade, lymphovascular invasion and percentage positive among total examined axillary lymph nodes. Likelihood Ratio Test p-value: 0.0127, df = 1.

Variables | Hazard Ratio (95% C.I.)
Age, yrs: 40-49 vs. < 40 | 1.13 (0.74-1.74)
Age, yrs: 50-65 vs. < 40 | 1.01 (0.67-1.53)
Age, yrs: > 65 vs. < 40 | 1.11 (0.71-1.74)
Grade: III vs. (II and I) | 1.31 (0.86-1.97)
Lymphovascular invasion: Positive vs. negative | 1.72 (1.21-2.44)
Tumor size, cm: 2-5 vs. ≤ 2 | 1.68 (1.22-2.32)
Tumor size, cm: > 5 vs. ≤ 2 | 1.72 (1.04-2.84)
Percentage of positive/total number of dissected axillary lymph nodes: 0-25 vs. 0 | 1.57 (1.07-2.30)
Percentage of positive/total number of dissected axillary lymph nodes: > 25 vs. 0 | 2.64 (1.77-3.96)
Breast cancer subtype: Basal (Core Basal) vs. 5NP | 1.47 (1.08-1.99)

Table 6.4b.
Cox regression analysis on the 95 Basal(TNP) tumors treated with anthracycline-based (AC or FAC) adjuvant chemotherapy with sufficient information for the clinicopathological covariates: age, tumour size, grade, lymphovascular invasion and percentage positive among total examined axillary lymph nodes. Likelihood Ratio Test p- value: 7.41x10 -5 , df=1.  Variables Hazard Ratio (95% C.I.) Age, yrs  ! 50 vs. < 50 1.48 (0.73-3.00) Grade II vs. (II and I) 0.38 (0.11-1.27) Lymphovascular invasion Positive vs. negative 1.19 (0.46-3.07) Tumor size, cm > 2 vs. " 2 2.20 (0.98-4.96) Percentage of positive/total number of dissected axillary lymph nodes  0-25 vs. 0 0.68 (0.20-2.28) > 25 vs. 0 5.07 (1.84-14.0) Breast cancer subtype Basal (Core Basal) vs. 5NP 4.26 (2.00-9.08)    201 Supplemental Table 6.1. Immunopanel of Breast Cancer Subtypes. The definitions of intrinsic breast cancer subtypes using five immunohistochemical surrogate markers.  ER PR HER2 HER1 CK5/6 TNP assignment Core Basal assignment ! ! + any any HER2+/ER!PR! HER2+/ER!PR! + any ! any any Luminal' Luminal' any + ! any any Luminal' Luminal' + any + any any Luminal/HER2+ Luminal/HER2+ any + + any any Luminal/HER2+ Luminal/HER2+ ! ! ! + any TNP Core Basal ! ! ! any + TNP Core Basal ! ! ! ! ! TNP 5NP  * Cases are considered unassigned if HER2 is missing, or if, for the luminal markers ER and PR, one value is missing and the other is negative. Correspondance to intrinsic expression profile assignments in Perou et al.1 is as follows: HER2 by expression profile = HER2+/ER–PR– by IHC, Luminal A by expression profile = Luminal’ by IHC, Luminal B by expression profile = Luminal/HER2+ or Luminal’ by IHC, Basal-like by expression profile = TNP or Core Basal by IHC.    202 Supplemental Table 6.2. Tumor Characteristics of Basal-like Tumors inChemotherapy Treated Cohort. Baseline tumor characteristics of Core Basal and 5NP cohorts receiving no adjuvant systemic therapy and chemotherapy (AC, FAC and CMF) treated subsets.  
Core Basal, N=179 5NP, N=141 Core Basal, N=32 5NP, N=36 Core Basal, N=16 5NP, N=22 Core Basal, N=55 5NP, N=31 Age, yrs <50 54 (30%) 39 (28%) 20 (63%) 19 (53%) 9 (56%) 15 (68%) 51 (93%) 29 (94%) ! 50 125 (70%) 102 (72%) 12 (37%) 17 (47%) 7 (43%) 7 (32%) 4 (7%) 2 (6%) Grade I & II 22 (12%) 51 (36%) 4 (13%) 6 (17%) 3 (19%) 1 (5%) 3 (5%) 9 (29%) III 154 (86%) 85 (60%) 28 (87%) 27 (75%) 13 (81%) 16 (73%) 51 (93%) 21 (68%) unknown 3 (2%) 5 (4%) 3 (8%) 5 (23%) 1 (2%) 1 (3%) lymphovascular invasion positive 46 (26%) 28 (20%) 15 (47%) 16 (44%) 14 (88%) 13 (59%) 30 (55%) 23 (74%) negative 127 (71%) 110 (78%) 17 (53%) 18 (50%) 2 (12%) 5 (23%) 23 (42%) 8 (26%) unknown 6 (3%) 3 (2%) 2 (6%) 4 (18%) 2 (4%) Tumor size, cm " 2 98 (55%) 76 (54%) 15 (47%) 16 (44%) 4 (25%) 2 (9%) 21 (38%) 11 (35%) > 2 81 (45%) 64 (45%) 17 (53%) 20 (56%) 11 (69%) 20 (91%) 34 (62%) 20 (65%) unknown 1 (1%) 1 (6%) Percentage of positive/total examined axillary lymph nodes 0% 146 (82%) 124 (88%) 13 (41%) 18 (50%) 2 (13%) 5 (23%) 24 (44%) 11 (35%) 0-25% 19 (11%) 8 (6%) 13 (41%) 9 (25%) 1 (6%) 3 (14%) 22 (40%) 15 (48%)  >25% 9 (5%) 7 (5%) 6 (19%) 9 (25%) 13 (81%) 13 (59%) 9 (16%) 5 (16%) unknown 5 (3%) 2 (1%) 1 (5%) No adjuvant systemic therapy Chemotherapy AC FAC CMF   203  Figure 6.1 BCSS of Breast Cancer Subtypes.  (a) Univariable breast cancer specific survival analysis of breast cancer subtypes defined by Triple Negative Phenotype (TNP) method using an immunohistochemical surrogate of HER2, ER and PR. The log-rank p- values between Luminal/HER2+ and Luminal’ are 3.86x10 -10 ; HER2+/ER!PR! and Luminal’ 1.94x10 -17 ; Luminal/HER2+ and HER2+/ER!PR! 0.159.       204 Figure 6.1b. Univariable breast cancer specific survival analysis of breast cancer subtypes defined by the Core Basal method using an immunohistochemical surrogate of HER2, ER, PR, EGFR and CK5/6. 
Core Basal has a statistically significantly worse survival than 5NP (log-rank p-value = 8.58 x 10^-3); Core Basal also has a similar survival rate to the HER2+/ER–PR– group for the first 3 years (log-rank p-value = 0.087).

Figure 6.2. BCSS of Basal-like Tumors Receiving Adjuvant AC or FAC Chemotherapy. Univariable breast cancer specific survival analysis of Core Basal versus 5NP among the 106 triple negative (TNP) tumors receiving AC or FAC as their adjuvant chemotherapy regimen.

Supplemental Figure 6.1. BCSS of Basal-like Tumors, Adjuvant Treatment Subsets. Kaplan-Meier survival curves of the Basal (Core Basal) and 5NP tumor groups (a) among the 320 Basal (TNP) tumors receiving no adjuvant systemic therapy and (b) among the 86 Basal (TNP) tumors receiving CMF as their adjuvant systemic therapy.

References:

1. Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, et al. Molecular portraits of human breast tumours. Nature 2000;406(6797):747-52.
2. Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A 2001;98(19):10869-74.
3. Sorlie T, Tibshirani R, Parker J, Hastie T, Marron JS, Nobel A, et al. Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci U S A 2003;100(14):8418-23.
4. Hu Z, Fan C, Oh DS, Marron JS, He X, Qaqish BF, et al. The molecular portraits of breast tumors are conserved across microarray platforms. BMC Genomics 2006;7:96.
5. Sorlie T, Wang Y, Xiao C, Johnsen H, Naume B, Samaha RR, et al. Distinct molecular mechanisms underlying clinically relevant subtypes of breast cancer: gene expression analyses across three different platforms. BMC Genomics 2006;7:127.
6. Fan C, Oh DS, Wessels L, Weigelt B, Nuyten DS, Nobel AB, et al.
Concordance among gene-expression-based predictors for breast cancer. N Engl J Med 2006;355(6):560-9.
7. Slamon DJ, Leyland-Jones B, Shak S, Fuchs H, Paton V, Bajamonde A, et al. Use of chemotherapy plus a monoclonal antibody against HER2 for metastatic breast cancer that overexpresses HER2. N Engl J Med 2001;344(11):783-92.
8. Haffty BG, Yang Q, Reiss M, Kearney T, Higgins SA, Weidhaas J, et al. Locoregional relapse and distant metastasis in conservatively managed triple negative early-stage breast cancer. J Clin Oncol 2006;24(36):5652-7.
9. Nielsen TO, Hsu FD, Jensen K, Cheang M, Karaca G, Hu Z, et al. Immunohistochemical and clinical characterization of the basal-like subtype of invasive breast carcinoma. Clin Cancer Res 2004;10(16):5367-74.
10. Carey LA, Perou CM, Livasy CA, Dressler LG, Cowan D, Conway K, et al. Race, breast cancer subtypes, and survival in the Carolina Breast Cancer Study. JAMA 2006;295(21):2492-502.
11. Abd El-Rehim DM, Pinder SE, Paish CE, Bell J, Blamey RW, Robertson JF, et al. Expression of luminal and basal cytokeratins in human breast carcinoma. J Pathol 2004;203(2):661-71.
12. Turner N, Tutt A, Ashworth A. Hallmarks of 'BRCAness' in sporadic cancers. Nat Rev Cancer 2004;4(10):814-9.
13. Livasy CA, Karaca G, Nanda R, Tretiakova MS, Olopade OI, Moore DT, et al. Phenotypic evaluation of the basal-like subtype of invasive breast carcinoma. Mod Pathol 2006;19(2):264-71.
14. Rodriguez-Pinilla SM, Sarrio D, Honrado E, Hardisson D, Calero F, Benitez J, et al. Prognostic significance of basal-like phenotype and fascin expression in node-negative invasive breast carcinomas. Clin Cancer Res 2006;12(5):1533-9.
15. Olivotto A, Coldman AJ, Hislop TG, Trevisan CH, Kula J, Goel V, et al. Compliance with practice guidelines for node-negative breast cancer. J Clin Oncol 1997;15(1):216-22.
16. Olivotto IA, Bajdik CD, Ravdin PM, Speers CH, Coldman AJ, Norris BD, et al. Population-based validation of the prognostic model ADJUVANT!
for early breast cancer. J Clin Oncol 2005;23(12):2716-25.
17. Cheang MC, Treaba DO, Speers CH, Olivotto IA, Bajdik CD, Chia SK, et al. Immunohistochemical detection using the new rabbit monoclonal antibody SP1 of estrogen receptor in breast cancer is superior to mouse monoclonal antibody 1D5 in predicting survival. J Clin Oncol 2006;24(36):5637-44.
18. Brown LA, Irving J, Parker R, Kim H, Press JZ, Longacre TA, et al. Amplification of EMSY, a novel oncogene on 11q13, in high grade ovarian surface epithelial carcinomas. Gynecol Oncol 2006;100(2):264-70.
19. Yaziji H, Goldstein LC, Barry TS, Werling R, Hwang H, Ellis GK, et al. HER-2 testing in breast cancer using parallel tissue-based methods. JAMA 2004;291(16):1972-7.
20. Bland JM, Altman DG. Survival probabilities (the Kaplan-Meier method). BMJ 1998;317(7172):1572.
21. Bland JM, Altman DG. The logrank test. BMJ 2004;328(7447):1073.
22. Cox DR, Oakes D. Analysis of Survival Data. London: England: Chapman & Hall; 1984.
23. Truong PT, Berthelet E, Lee J, Kader HA, Olivotto IA. The prognostic significance of the percentage of positive/dissected axillary lymph nodes in breast cancer recurrence and survival in patients with one to three positive axillary lymph nodes. Cancer 2005;103(10):2006-14.
24. Grambsch P, Therneau T. Proportional hazards tests and diagnostics based on weighted residuals. Biometrika 1994;81(3):515-26.
25. Hosmer DW, Lemeshow S. Applied Logistic Regression. New York: NY: Wiley & Sons; 1989.
26. Tableman M, Kim JS. Survival Analysis Using S: Analysis of Time-to-Event Data. Boca Raton: Chapman & Hall/CRC; 2004.
27. Efron B, Tibshirani RJ. An Introduction to the Bootstrap: Chapman & Hall; 1994.
28. Brenton JD, Carey LA, Ahmed AA, Caldas C. Molecular classification and molecular forecasting of breast cancer: ready for clinical application? J Clin Oncol 2005;23(29):7350-60.
29. Jacquemier J, Padovani L, Rabayrol L, Lakhani SR, Penault-Llorca F, Denoux Y, et al.
Typical medullary breast carcinomas have a basal/myoepithelial phenotype. J Pathol 2005;207(3):260-8.
30. Bauer KR, Brown M, Cress RD, Parise CA, Caggiano V. Descriptive analysis of estrogen receptor (ER)-negative, progesterone receptor (PR)-negative, and HER2-negative invasive breast cancer, the so-called triple-negative phenotype: a population-based study from the California cancer Registry. Cancer 2007;109(9):1721-8.
31. Rakha EA, El-Sayed ME, Green AR, Lee AH, Robertson JF, Ellis IO. Prognostic markers in triple-negative breast cancer. Cancer 2007;109(1):25-32.
32. Rakha EA, El-Rehim DA, Paish C, Green AR, Lee AH, Robertson JF, et al. Basal phenotype identifies a poor prognostic subgroup of breast cancer of clinical importance. Eur J Cancer 2006;42(18):3149-56.
33. Jumppanen M, Gruvberger-Saal S, Kauraniemi P, Tanner M, Bendahl PO, Lundin M, et al. Basal-like phenotype is not associated with patient survival in estrogen receptor negative breast cancers. Breast Cancer Res 2007;9(1):R16.
34. Banerjee S, Reis-Filho JS, Ashley S, Steele D, Ashworth A, Lakhani SR, et al. Basal-like breast carcinomas: clinical outcome and response to chemotherapy. J Clin Pathol 2006;59(7):729-35.
35. Rouzier R, Perou CM, Symmans WF, Ibrahim N, Cristofanilli M, Anderson K, et al. Breast cancer molecular subtypes respond differently to preoperative chemotherapy. Clin Cancer Res 2005;11(16):5678-85.
36. Carey LA, Dees EC, Sawyer L, Gatti L, Moore DT, Collichio F, et al. The Triple Negative Paradox: Primary Tumor Chemosensitivity of Breast Cancer Subtypes. Clin Cancer Res 2007;13(8):2329-2334.
37. Conforti R, Boulet T, Tomasic G, Taranchon E, Arriagada R, Spielmann M, et al. Breast cancer molecular subclassification and estrogen receptor expression to predict efficacy of adjuvant anthracyclines-based chemotherapy: a biomarker study from two randomized trials. Ann Oncol 2007;18(9):1477-83.
38. Ravdin PM, Siminoff LA, Davis GJ, Mercer MB, Hewlett J, Gerson N, et al.
Computer program to assist in making decisions about adjuvant therapy for women with early breast cancer. J Clin Oncol 2001;19(4):980-91.

CHAPTER SEVEN

GENERAL DISCUSSION AND CONCLUSION

A consistent theme is presented in this thesis: human breast carcinomas are classifiable into distinct molecular subtypes using immunohistochemical biomarkers. Our results show that a surrogate panel of estrogen receptor, progesterone receptor, HER2, Ki-67, epidermal growth factor receptor and cytokeratin 5/6 can recapitulate the breast cancer biologic subtypes originally identified by gene expression profiling. These subtypes are associated with significantly different survival probabilities and provide independent prognostic information in addition to the standard clinicopathological parameters (tumour size, grade, nodal status, lymphovascular invasion, patient's age at diagnosis) used by oncologists in their decision-making about treatment options. Overall, a multidisciplinary approach has been adopted in this dissertation, combining biostatistics, pathology and oncology to associate biomarker expression with patient outcome using large, clinically well-annotated tissue microarrays.

The tissue microarray platform is a popular and efficient tool for translational biomarker research. Chapter 3 demonstrates how this technology can be applied to compare rigorously the quality of two immunohistochemical monoclonal anti-ER antibodies against both patient outcome and ER measurements from biochemical dextran-coated charcoal assays. The results presented in that chapter contribute significantly to quality assurance for breast cancer diagnosis. Estrogen receptor testing is essential for every invasive breast cancer; thus, identification of an improved immunohistochemical methodology is of immediate relevance.
Our study cohort consists of more than 4000 patients, and therefore gives sufficient statistical power to detect survival differences among the uncommon discrepant ER results between these two immunohistochemical assays (SP1 antibody versus 1D5 antibody). Our results support the view that a large tissue microarray study linked to detailed clinical information (patient outcomes and other assay results) can identify significant improvements even in established, widely used immunohistochemical assays for an important predictive biomarker. Approximately 20% of clinical laboratories in the United States now use the SP1 antibody for ER testing. 6F11 is another widely used monoclonal antibody for ER testing. We plan to immunostain the same series of over 4000 patients with 6F11 and then perform another head-to-head comparison with the SP1 antibody.

Chapter 6 applies a similar approach to the more complex question of the best way to identify the recently identified, clinically important basal-like subtype of breast cancer. The recognition of the basal-like subtype through gene expression profile analysis has had a great impact on breast cancer(1) because these tumors do not benefit from endocrine therapy or from trastuzumab. In current clinical practice, a triple negative phenotype (ER/PR/HER2 negative), as determined by standard clinical immunohistochemical tests for these three markers, is often considered essentially synonymous with the basal-like subtype, despite the absence of a formal validation study on a large series of patients(2). Our work demonstrates that the poor prognosis of triple-negative phenotype breast cancers is conferred almost entirely by the subset of tumors positive for basal markers (which are readily available in hospital laboratories and commonly employed in the diagnosis of other tumors such as lung cancers).
In addition, among triple-negative patients treated with anthracycline-containing adjuvant chemotherapy, positivity for the additional basal markers identified a cohort of patients with significantly worse outcomes. This suggests that the older chemotherapy regimen (CMF) may be a better choice for such patients than anthracyclines, which have serious side effects (ranging from hair loss to cardiotoxicity). One important caveat of our study, however, is that these 4000 patients were not randomized to receive different adjuvant treatment regimens. Therefore, we cannot formally address the value of the basal-like phenotype in predicting response to adjuvant chemotherapy regimens. The best approach is to consider this result hypothesis-generating, and to test our hypotheses in randomized clinical trials.

The MA5 trial conducted by the National Cancer Institute of Canada (NCIC) appears well-suited for testing the value of basal markers to predict response to anthracycline-based chemotherapy. In this trial, pre- and peri-menopausal women with histologically confirmed, axillary node-positive breast cancer were recruited between 1989 and 1993. These patients had undergone either modified radical mastectomy or lumpectomy. Patients were then randomized to receive adjuvant cyclophosphamide, epirubicin and fluorouracil (CEF, an anthracycline-based regimen) or cyclophosphamide, methotrexate and fluorouracil (CMF, the older non-anthracycline regimen). In the study as a whole, there was a statistically significant improvement in relapse-free survival and overall survival for patients treated with CEF compared with CMF(3, 4). However, a subsequent analysis has shown that the benefit from CEF over CMF was confined to HER2 amplified/over-expressing breast cancers(5). It remains an open question whether the subset of basal-like breast cancers actually shows the reverse pattern of response.
In collaboration with Stephen Chia of the British Columbia Cancer Agency, we have put forward a proposal to the National Cancer Institute of Canada Clinical Trials Group, and successfully gained approval to test our hypothesis on MA5 trial materials. Immunohistochemistry for our biomarker panel (ER, HER2, PR, EGFR, CK5/6, Ki67) is being performed over summer 2008. For this study, survival analyses will be done at the NCIC statistical office. We expect to see a smaller benefit from CEF for triple negative tumors expressing EGFR or CK5/6. The predictive value of the basal-like phenotype for anthracycline-based versus non-anthracycline adjuvant regimens will be assessed by comparing the outcomes of this biological subtype among patients randomly assigned to CEF versus CMF. The findings of this study could potentially help guide adjuvant treatment decision-making, by offering an option that spares patients with basal-like tumors from having to be treated with more toxic anthracycline-based chemotherapy.

Another important finding relating to my thesis work on basal-like tumors is their association with the alpha-B-crystallin oncogene (CRYAB)(6). Torsten Nielsen, with Vince Cryns of Northwestern University, previously reported that tumors over-expressing this protein had a poor prognosis, and that at least half of basal-like breast cancers express CRYAB. Using the regional population-based series of 4000 patients, we have validated the poor prognosis associated with CRYAB. Consistent with the previous report, 56% (176/317) of basal-like tumors express CRYAB. In the multivariable Cox regression model including lymphovascular invasion, tumor size, grade, nodal involvement, age at diagnosis and breast cancer subtypes, CRYAB remains an independent poor prognostic marker with a hazard ratio of 1.3 (95% C.I. 1.1-1.6).
Within the subset of basal tumors, CRYAB positive tumors are also independently associated with poorer breast cancer survival (Appendix Figure A.1 and Appendix Table A.1) with a hazard ratio of 1.6 (p = 0.02). These findings are consistent with the oncogenic role of CRYAB in functional studies (overexpression of CRYAB increased cell migration and invasion(6)) and show that CRYAB is a biomarker of particularly aggressive behavior among basal-like tumors. I presented this work as an oral platform presentation at the 6th European Breast Cancer Symposium (Berlin, Germany, April 15-19, 2008). In collaboration with Hagen Kennecke of the Breast Cancer Outcomes Unit at the British Columbia Cancer Agency, a full review of clinical charts has been done to capture the sites of distant metastases in our 4000-patient tissue microarray series. Based on experimental mouse studies by Dr. Cryns, we hypothesize that basal-like tumors will be more likely to metastasize to brain and lung if they express CRYAB. The results of this ongoing work will contribute to our understanding of the biological mechanisms of clinically aggressive behavior by basal-like tumors.

There are now many published reports identifying prognostic factors in breast cancer, some of which have presented strong conclusions before reproducibility was demonstrated. Indeed, relatively few findings have been independently validated, with low success rates partially owing to poor study design, as has been discussed(7, 8). The major criticism of this kind of discovery-based research is often over-fitting of data, such as training the gene classifiers on the same data set used for assessing clinical significance, or carrying out multiple analyses in a hypothesis-generating “fishing expedition”. In order to avoid over-fitting of the data with consequent over-optimistic conclusions, we have used at least two independent cohorts of patients in the studies presented in chapters 3 through 6.
In chapters 4 and 5, the objective is to determine whether immunohistochemical biomarkers can risk-classify hormone receptor positive tumors into low-risk Luminal A and high-risk Luminal B subtypes.

Chapter 4 demonstrates the importance of validation studies. A previous study published by others, on a cohort of 139 patients, had suggested GATA-3 to be a strong independent prognostic marker for good outcome in breast cancer(9). Using a cohort of 438 patients, we also found that GATA-3 positive expression was associated with better breast cancer survival among hormone receptor positive tumors. However, one common limitation shared by these two studies was the necessity of drawing survival conclusions from data on heterogeneously treated patients. As shown in Chapter 4, in the series of 4000 patients, GATA-3 remains highly associated with ER expression, but does not seem to have prognostic value independent of ER, nor does it predict response to tamoxifen among ER-positive patients. Although this may be considered a negative report on GATA-3 in terms of its value as a clinical test, the fact that GATA-3 expression is so highly correlated with ER is an interesting topic for understanding the biology of estrogen-driven breast cancer. For example, an independent study has shown that GATA-3 is somatically mutated in some ER positive tumors(10).

In Chapter 5, we developed an immunopanel (ER, PR, HER2 and Ki-67) to define the high-risk Luminal B subtype, by reference to a gene expression profile gold standard in a cohort of 358 invasive breast tumors. We then validated the prognostic value of this panel in the series of 4000 patients. The advantage of our approach is that the immunopanel is determined against an important distinction in the underlying biology of breast cancer rather than by directly linking to clinical outcome; therefore it is more likely to be reproducible in other cohorts of patients independent of treatment regimens.
Our results are important because Luminal B breast cancers are a clinically important subgroup associated with a poor clinical outcome in both the presence and absence of systemic adjuvant therapy. Therefore, it is an important group to target for future clinical investigation.

Relapse-free and breast cancer-specific survival were the two endpoints used for most survival analyses in this dissertation. In the contemporary setting, patients undergo either mastectomy or lumpectomy followed by radiation therapy for their local treatment. The traditional view has been that, because of the overwhelming importance of metastases in patient outcome, local treatments do not actually affect overall survival. A recent meta-analysis, however, reported that locoregional radiotherapy can indeed improve breast cancer survival, and it was estimated that 1 out of 4 locoregional recurrences would result in a breast cancer death(11). Currently, there is a lack of clinically useful biomarkers to help guide decision-making about locoregional radiotherapy. Therefore, we compared the local and regional relapse rates among the breast cancer biologic subtypes defined by our immunopanel of 6 biomarkers, using our series of 4000 patients. We confirmed that these biologic subtypes are aggressive locally as well as systemically. These results were presented as a platform presentation at the 2008 American Society of Clinical Oncology Annual Meeting(12). When compared with the low risk Luminal A subtype, the HER2+ and Basal-like subtypes are each significantly associated with worse first regional relapse-free survival, irrespective of the type of local treatment received; the Luminal B subtype is significantly worse among patients treated with mastectomy and shows a non-significant trend (p = 0.073) among those treated with lumpectomy plus radiation (Appendix Table A.2a and b). When compared with Luminal A, the Luminal/HER2 and Luminal B subtypes have hazard ratios of 2.1 and 1.6, respectively, for local relapse within the subset of patients who had mastectomy with or without radiation (Appendix Table A.3a).
However, there is no statistically significant difference in first local relapse-free survival among the breast cancer biologic subtypes within patients who had lumpectomy plus radiation (Appendix Table A.3b). Because recognizing that some breast cancer subtypes will require more aggressive local treatment is very relevant to the clinical practice of radiation oncologists, a manuscript on this topic is in preparation as a follow-up to my dissertation work.

In summary, our clinically practical breast cancer-subtyping panel appears to be of great relevance for predicting patient outcome and evaluating the need for aggressive therapy. In order to move this into patient care, our immunopanel will be applied to several important prospective randomized clinical trials, in addition to the aforementioned NCIC MA5 trial. The NCIC MA12 trial is a Canadian multi-center phase III study conducted between July 1993 and April 2000, in which 672 pre-menopausal women with early breast cancer were randomized, after adjuvant chemotherapy, between tamoxifen and placebo. We aim to assess the prognostic significance of the breast cancer subtypes and the predictive significance of the Luminal A versus Luminal B subtypes for benefit from tamoxifen. This is a collaborative study involving Torsten O. Nielsen and Stephen Chia of the British Columbia Cancer Agency. CALGB 9741 and CALGB 9344 are American breast cancer inter-group randomized phase III adjuvant trials of dose-dense chemotherapy and paclitaxel, respectively, for women with node positive breast cancer. We hypothesize that Luminal B and HER2+ subtypes are sensitive to dose-dense therapy. We also aim to evaluate the benefits of adjuvant taxane for the basal-like and Luminal B subtypes. This is a collaborative work involving Torsten O. Nielsen and Matthew J. Ellis of Washington University in St. Louis.
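
For readers who wish to apply the subtyping panel to their own marker data, the five-marker assignment rules of Supplemental Table 6.1 can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used in this thesis: the function name `assign_subtype`, the use of True/False/None for positive/negative/missing, and the label strings are all assumptions made for the example. The handling of triple negative cases whose basal markers (HER1, CK5/6) are incompletely scored is not specified in the table, so the sketch simply leaves them unresolved under the Core Basal scheme.

```python
from typing import Optional, Tuple

Marker = Optional[bool]  # True = positive, False = negative, None = missing


def assign_subtype(er: Marker, pr: Marker, her2: Marker,
                   her1: Marker = None, ck56: Marker = None) -> Tuple[str, str]:
    """Return (TNP-method label, Core Basal-method label) for one tumor.

    Hypothetical helper following the rows of Supplemental Table 6.1.
    """
    # Footnote rule: unassigned if HER2 is missing.
    if her2 is None:
        return ("Unassigned", "Unassigned")

    hormone_positive = er is True or pr is True

    # Footnote rule: unassigned if one of ER/PR is missing and the other is
    # negative (assumption: also unassigned when both are missing).
    if not hormone_positive and (er is None or pr is None):
        return ("Unassigned", "Unassigned")

    if hormone_positive:
        label = "Luminal/HER2+" if her2 else "Luminal'"
        return (label, label)

    if her2:
        return ("HER2+/ER-PR-", "HER2+/ER-PR-")

    # ER, PR and HER2 are all negative: triple negative phenotype (TNP).
    if her1 is True or ck56 is True:
        return ("TNP", "Core Basal")
    if her1 is False and ck56 is False:
        return ("TNP", "5NP")

    # Assumption: basal markers incompletely scored, so the Core Basal
    # scheme cannot be resolved for this case.
    return ("TNP", "Unassigned")


if __name__ == "__main__":
    print(assign_subtype(False, False, False, her1=False, ck56=True))
```

Returning both labels from a single call keeps the nested relationship between the two schemes explicit: every Core Basal or 5NP case is, by construction, a TNP case.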
In closing, my dissertation may be considered an example of translational breast cancer research that is of great interest both to pathologists developing clinically relevant and practical diagnostic tests and to oncologists seeking to use such tests for clinical decisions about which therapeutic approach is most likely to benefit their patients with breast cancer.

References:

1. Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, et al. Molecular portraits of human breast tumours. Nature 2000;406(6797):747-52.
2. Kreike B, van Kouwenhove M, Horlings H, Weigelt B, Peterse H, Bartelink H, et al. Gene expression profiling and histopathological characterization of triple-negative/basal-like breast carcinomas. Breast Cancer Res 2007;9(5):R65.
3. Levine MN, Pritchard KI, Bramwell VH, Shepherd LE, Tu D, Paul N. Randomized trial comparing cyclophosphamide, epirubicin, and fluorouracil with cyclophosphamide, methotrexate, and fluorouracil in premenopausal women with node-positive breast cancer: update of National Cancer Institute of Canada Clinical Trials Group Trial MA5. J Clin Oncol 2005;23(22):5166-70.
4. Levine MN, Bramwell VH, Pritchard KI, Norris BD, Shepherd LE, Abu-Zahra H, et al. Randomized trial of intensive cyclophosphamide, epirubicin, and fluorouracil chemotherapy compared with cyclophosphamide, methotrexate, and fluorouracil in premenopausal women with node-positive breast cancer. National Cancer Institute of Canada Clinical Trials Group. J Clin Oncol 1998;16(8):2651-8.
5. Pritchard KI, Shepherd LE, O'Malley FP, Andrulis IL, Tu D, Bramwell VH, et al. HER2 and responsiveness of breast cancer to adjuvant chemotherapy. N Engl J Med 2006;354(20):2103-11.
6. Moyano JV, Evans JR, Chen F, Lu M, Werner ME, Yehiely F, et al. AlphaB-crystallin is a novel oncoprotein that predicts poor clinical outcome in breast cancer. J Clin Invest 2006;116(1):261-70.
7. Simon R, Radmacher MD, Dobbin K, McShane LM.
Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification. J Natl Cancer Inst 2003;95(1):14-8. 8. Ransohoff DF. Rules of evidence for cancer molecular-marker discovery and validation. Nat Rev Cancer 2004;4(4):309-14. 9. Mehra R, Varambally S, Ding L, Shen R, Sabel MS, Ghosh D, et al. Identification of GATA3 as a breast cancer prognostic marker by global gene expression meta-analysis. Cancer Res 2005;65(24):11259-64. 10. Usary J, Llaca V, Karaca G, Presswala S, Karaca M, He X, et al. Mutation of GATA3 in human breast tumors. Oncogene 2004;23(46):7669-78. 11. Clarke M, Collins R, Darby S, Davies C, Elphinstone P, Evans E, et al. Effects of radiotherapy and of differences in the extent of surgery for early breast cancer on local recurrence and 15-year survival: an overview of the randomised trials. Lancet 2005;366(9503):2087-106. 12. Cheang MC, Voduc D, Tyldesley S, Gelmon KA, Ellis MJ, Bernard PS, et al. Breast cancer molecular subtypes and locoregional recurrence. J Clin Oncol 2008;26:May 20 Suppl; abstr 510.      222               APPENDIX  223 Appendix Table A.1 Cox Model of alpha-Basic-Crystallin. Multivariable Cox regression analysis to estimate the adjusted hazard ratio of !B-crystallin among 289 basal tumors with sufficient information for all the clinicopathlogic covariates: age, tumor size, grade, lymphovascular invasion (LVI) and percentage of positive among total examined axillary lymph nodes. The likelihood ratio test p-value: 0.0200.   Hazard Ratio (95% C.I.) P-value Age at diagnosis, yrs 1.00 (0.99-1.02) 0.49 Proportion of positive nodes over total examined axillary lymph nodes  ! 25% vs. 0% 1.56 (0.95-2.59) 0.087 > 25% vs. 0% 2.58 (1.52-4.38) 0.00042 LVI Positive vs. Negative 1.66 (1.06-2.6) 0.027 Tumor size, cm > 2 vs. ! 2 1.67 (1.10-2.53) 0.016 Grade 3 vs. 2/1 1.41 (0.73-2.71) 0.31 "B-crystallin Positive vs. 
Negative 1.63 (1.09-2.45) 0.018   224 Appendix Table A.2 Cox Model of Regional Recurrences of Breast Cancer Subtypes. Multivariable Cox regression analysis of first regional relapse free survival to estimate the adjusted hazard ratio of breast cancer subtypes. Adjuvant systemic therapy was used as strata. (a) Mastectomy +/- radiation therapy (N = 1334). (b) Lumpectomy + radiation (N = 1062)  (a)  Hazard Ratio (95% C.I.) P-value Tumor size, cm > 2 and <= 5 vs. <= 2 1.1 (0.74-1.65) 0.64 > 5 vs. <= 2 0.96 (0.45-2.07) 0.92 Grade 3 vs. 1/2 2.15 (1.35-3.42) 0.0012 Nodal status Positive vs. negative 1.49 (0.90-2.48) 0.12 Lymphovascular invasion Positive vs. negative 1.87 (1.18-2.98) 0.008 Margin Status Positive vs. negative 0.79 (0.39-1.62) 0.52 Age at diagnosis 40-49 vs < 40 0.73 (0.37-1.43) 0.36 50-65 vs < 40 0.74 (0.41-1.35) 0.32 > 65 vs < 40 0.68 (0.36-1.28) 0.23 Radiation therapy Yes vs. no 0.78 (0.49-1.24) 0.30 Breast cancer subtypes Luminal/HER2 vs Luminal A 3.28 (1.75-6.18) 0.00023 Luminal B vs. Luminal A 2.41 (1.44-4.02) 0.00079 HER2+ vs. Luminal A 2.21 (1.10-4.41) 0.025 Basal vs. Luminal A 3.39 (1.81-6.37) 0.00014  (b)   Hazard Ratio (95% C.I.) P-value Tumor size, cm > 2 vs. <= 2 1.67 (1.02-2.74) 0.042 Grade 3 vs. 1/2 1.10 (0.63-1.91) 0.74 Nodal status Positive vs. negative 1.11 (0.98-1.25) 0.089 Lymphovascular invasion Positive vs. negative 1.76 (0.99-3.12) 0.054 Margin Status Positive vs. negative 1.31 (0.70-2.44) 0.40  225 Age at diagnosis 40-49 vs < 40 0.71 (0.35-1.45) 0.35 50-65 vs < 40 0.56 (0.27-1.16) 0.12 > 65 vs < 40 0.70 (0.33-1.48) 0.35 Breast cancer subtypes Luminal/HER2 vs Luminal A 1.29 (0.43-3.89) 0.65 Luminal B vs. Luminal A 1.78 (0.95-3.36) 0.073 HER2+ vs. Luminal A 2.87 (1.22-6.74) 0.016 Basal vs. Luminal A 3.07 (1.45-6.53) 0.0035   226 Appendix Table A.3 Cox Model of Local Recurrences of Breast Cancer Subtypes. 
Multivariable Cox regression analysis of first local relapse free survival to estimate the adjusted hazard ratio of breast cancer subtypes. Adjuvant systemic therapy was used as strata. (a) Mastectomy +/- radiation therapy (N = 1334). (b) Lumpectomy + radiation (N = 1062)  (a)   Hazard Ratio (95% C.I.) P-value Tumor size, cm > 2 and <= 5 vs. <= 2 1.28 (0.88-1.86) 0.20 > 5 vs. <= 2 2.37 (1.29-4.34) 0.0052 Grade 3 vs. 1/2 1.58 (1.07-2.33) 0.022 Nodal status Positive vs. negative 1.69 (1.04-2.73) 0.033 Lymphovascular invasion Positive vs. negative 1.61 (1.06-2.45) 0.025 Margin Status Positive vs. negative 0.95 (0.52-1.74) 0.86 Age at diagnosis 40-49 vs < 40 0.92 (0.50-1.70) 0.78 50-65 vs < 40 0.86 (0.49-1.52) 0.61 > 65 vs < 40 0.57 (0.31-1.05) 0.071 Radiation therapy Yes vs. no 0.60 (0.39-0.94) 0.024 Breast cancer subtypes Luminal/HER2 vs Luminal A 2.10 (1.19-3.69) 0.010 Luminal B vs. Luminal A 1.57 (1.02-2.42) 0.041 HER2+ vs. Luminal A 1.58 (0.87-2.87) 0.13 Basal vs. Luminal A 1.53 (0.84-2.76) 0.16   227 (b)   Hazard Ratio (95% C.I.) P-value Tumor size, cm > 2 vs. <= 2 0.86 (0.56-1.32) 0.49 Grade 3 vs. 1/2 1.46 (0.93-2.27) 0.097 Nodal status Positive vs. negative 1.36 (0.77-2.40) 0.29 Lymphovascular invasion Positive vs. negative 1.26 (0.77-2.07) 0.36 Margin Status Positive vs. negative 2.14 (1.34-3.41) 0.0014 Age at diagnosis 40-49 vs < 40 0.60 (0.33-1.10) 0.097 50-65 vs < 40 0.51 (0.29-0.90) 0.020 > 65 vs < 40 0.43 (0.23-0.81) 0.0093 Breast cancer subtypes Luminal/HER2 vs Luminal A 0.81 (0.31-2.07) 0.65 Luminal B vs. Luminal A 0.95 (0.56-1.61) 0.84 HER2+ vs. Luminal A 1.78 (0.92-3.45) 0.086 Basal vs. Luminal A 1.26 (0.67-2.37) 0.48    228 Appendix Figure A.1 BCSS of alpha-Basic-Crysallin Among Basal-like Tumors. Breast cancer survival curves of !B-crystallin negative versus positive among basal tumors. !B- crystallin positive tumors are associated with poor breast cancer survival (10-yr BCSS (95% C.I.) 68% (58-74) versus 57% (50-64)).   
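As an illustrative consistency check on Appendix Table A.1 (not part of the original analysis), the reported p-value for αB-crystallin can be approximately recovered from the hazard ratio and its 95% confidence interval. The sketch below assumes the interval is a Wald interval, symmetric on the log hazard scale; the function name wald_p_from_ci is hypothetical, and rounding in the published table limits the precision of the result.

```python
import math

def wald_p_from_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate SE, z statistic and two-sided Wald p-value from a
    hazard ratio and its 95% confidence interval.

    Assumption: the interval was built as exp(log(HR) +/- z_crit * SE).
    """
    log_hr = math.log(hr)
    # The width of the interval on the log scale is 2 * z_crit * SE.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# alphaB-crystallin, positive vs. negative (Appendix Table A.1)
se, z, p = wald_p_from_ci(1.63, 1.09, 2.45)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Under these assumptions the sketch gives a p-value of about 0.018, in line with the value reported in the table. The same check can be applied to any row of Appendix Tables A.1 to A.3.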
Ethical Approval Form
