UBC Theses and Dissertations

Gene expression and mutation profiles define novel subclasses of cytogenetically normal acute myeloid.. Pilsworth, Jessica 2016

GENE EXPRESSION AND MUTATION PROFILES DEFINE NOVEL SUBCLASSES OF CYTOGENETICALLY NORMAL ACUTE MYELOID LEUKEMIA

by

Jessica Pilsworth

B.Sc., The University of Waterloo, 2013

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Medical Genetics)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

April 2016

© Jessica Pilsworth, 2016

Abstract

Acute myeloid leukemia (AML) is a genetically heterogeneous disease characterized by the accumulation of acquired somatic genetic abnormalities in hematopoietic progenitor cells. Recurrent chromosomal rearrangements are well-established diagnostic and prognostic markers. However, approximately 50% of AML cases have normal cytogenetics and respond variably to conventional chemotherapy. Molecular markers have begun to subdivide cytogenetically normal AML (CN-AML) and have been shown to predict clinical outcome. Despite these achievements, current classification schemes are not completely accurate and improved risk stratification is required. My overall objective was to identify specific gene expression and mutation signatures to define novel subclasses of CN-AML. I hypothesized that CN-AML could be separated into two or more subgroups. Gene expression and mutational profiles were established using RNA-Sequencing, clustering, de novo transcriptome assembly, and variant detection. I found that CN-AML could be separated into three groups, two of which had statistically significant survival differences (Kaplan-Meier analysis, log-rank test, p = 9.75 × 10⁻³). Variant analysis revealed nine fusions that are not detectable via cytogenetic analysis, and differential expression analysis identified a set of discriminatory genes to classify each subgroup. These findings contribute to the current understanding of the genetic complexity of AML and highlight gene fusion candidates for follow-up functional analyses.
Preface

Under the supervision of Dr. Inanc Birol and in collaboration with Dr. Aly Karsan’s laboratory, I designed the project described in this thesis with advice from Dr. Ewan Gibb and Dr. Gordon Robertson. I performed all the gene expression and mutation profiling with training provided by Dr. Gordon Robertson, Readman Chiu and Ka Ming Nip. Rod Docking performed the gene expression quantification.

The BC Cancer Agency’s Clinical Research Ethics Board has approved this work under certificate number H13-02687.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations
List of Genes
Acknowledgements
Dedication
Chapter 1: Introduction
    1.1 Background and motivation
    1.2 Pathogenesis
    1.3 Classification of AML
        1.3.1 Diagnostic procedures
        1.3.2 Prognostic factors
        1.3.3 Standard treatment
    1.4 Genetics of AML
        1.4.1 Core-binding factor leukemias
        1.4.2 Acute promyelocytic leukemia
        1.4.3 11q23 rearrangements (KMT2A)
        1.4.4 Complex karyotypes, trisomies, and monosomal karyotypes
        1.4.5 Recurrent mutations
    1.5 Results from The Cancer Genome Atlas project
    1.6 Gene expression profiling
    1.7 Summary
    1.8 Hypothesis
    1.9 Thesis objectives
Chapter 2: Methods
    2.1 Summary
    2.2 Primary AML samples
    2.3 Library construction and sequencing
    2.4 Gene expression quantification
    2.5 Unsupervised consensus clustering and survival analysis
    2.6 Transcriptome assembly and variant analysis
    2.7 Cutoff optimization and survival analysis of significant genes
    2.8 Differential expression analysis
    2.9 Comparison of significant genes
    2.10 Group-discriminatory genes
Chapter 3: Results
    3.1 Overview
    3.2 Gene expression and mutation signatures define AML subtypes and CN-AML
    3.3 Gene expression profiles and novel fusions define distinct subclasses of CN-AML
    3.4 Gene expression of discriminatory gene sets classify CN-AML
Chapter 4: Discussion
    4.1 Conclusions and future directions
Bibliography

List of Tables

Table 1.1: World Health Organization classification of AML in 2008
Table 1.2: Revised risk stratification in 2012
Table 1.3: Improved functional categories of genetic changes in AML
Table 3.1: Patient characteristics
Table 3.2: Fusions identified in CN-AML cases in the PMP cohort
Table 3.3: Genetic alterations in CN-AML cases from the TCGA cohort

List of Figures

Figure 3.1: Workflow of the analyses performed
Figure 3.2: Unsupervised gene expression patterns
Figure 3.3: Kaplan-Meier survival analysis between groups
Figure 3.4: Consensus clustering of CN-AML samples
Figure 3.5: Unsupervised gene expression patterns for CN-AML cases
Figure 3.6: OncoPrint of fusions identified in CN-AML cases
Figure 3.7: Kaplan-Meier survival analysis of gene expression
Figure 3.8: Differentially expressed genes in the PMP and TCGA cohorts
Figure 3.9: Significant associations of differentially expressed genes between groups
Figure 3.10: Differentially expressed genes of significant group associations
Figure 3.11: Greatest fold-changes in groups from the PMP and TCGA cohorts
Figure 3.12: Discriminatory genes for groups in the PMP and TCGA cohorts

List of Abbreviations

2HG       2-hydroxyglutarate
AML       Acute myeloid leukemia
APL       Acute promyelocytic leukemia
ATRA      All-trans-retinoic acid
CBF       Core-binding factor
CD34      Cluster of differentiation 34 antigen
CDF       Cumulative distribution function
CML       Chronic myeloid leukemia
CN-AML    Cytogenetically normal acute myeloid leukemia
CNVs      Copy number variations
CR        Complete remission
ENU       Ethyl-nitrosourea
FAB       French-American-British
HPCs      Hematopoietic progenitor cells
ITD       Internal tandem duplication
JM        Juxtamembrane
LZ        Leucine zipper
MDS       Myelodysplastic syndrome
MPN       Myeloproliferative neoplasms
NK        Normal karyotype
OOB       Out-of-bag (error rate)
OS        Overall survival
P1        Personalized Medicine Project group 1
P2        Personalized Medicine Project group 2
P3        Personalized Medicine Project group 3
PE        Paired-end
PMP       Personalized Medicine Project
PRC2      Polycomb-repressive complex 2
PTD       Partial tandem duplications
RBCs      Red blood cells
RPKM      Reads per kilobase per million mapped reads
RT-PCR    Reverse transcriptase-polymerase chain reaction
sAML      Secondary AML
SAM       Significance Analysis of Microarrays
SCT       Stem cell transplant
T1        The Cancer Genome Atlas group 1
T2        The Cancer Genome Atlas group 2
T3        The Cancer Genome Atlas group 3
T4        The Cancer Genome Atlas group 4
TF        Transcription factor
T-ALL     T-cell acute lymphocytic leukemia
tAML      Therapy-related AML
TCGA      The Cancer Genome Atlas
TKD       Tyrosine kinase domain
WBCs      White blood cells
WHO       World Health Organization

List of Genes

ACIN1          apoptotic chromatin condensation inducer 1
AKT            v-akt murine thymoma viral oncogene homolog 1
ALAS2          5'-aminolevulinate synthase 2
ALK            anaplastic lymphoma receptor tyrosine kinase
ARHGAP15       Rho GTPase activating protein 15
ASXL1          additional sex combs like 1, transcriptional regulator
ASXL2          additional sex combs like 2, transcriptional regulator
BAALC          brain and acute leukemia, cytoplasmic
C17orf87       SLP adaptor and CSK interacting membrane protein
CBFB           core-binding factor, beta subunit
CD4            CD4 molecule
CEBPA          CCAAT/enhancer binding protein alpha
CREBBP         CREB binding protein
DEK            DEK proto-oncogene
DMXL2          Dmx like 2
DNMT3A         DNA (cytosine-5-)-methyltransferase 3 alpha
DNMT3B         DNA (cytosine-5-)-methyltransferase 3 beta
DOT1L          DOT1-like histone H3K79 methyltransferase
EP300          E1A binding protein p300
EPB41          erythrocyte membrane protein band 4.1
EPOR           erythropoietin receptor
ERG            v-ets avian erythroblastosis virus E26 oncogene homolog
EVI1/MECOM     MDS1 and EVI1 complex locus
EZH2           enhancer of zeste 2 polycomb repressive complex 2 subunit
FGF13          fibroblast growth factor 13
FLT3           fms related tyrosine kinase 3
FN1            fibronectin 1
GAS5           growth arrest specific 5 (non-protein coding)
GATA1          GATA binding protein 1 (globin transcription factor 1)
GATA2          GATA binding protein 2 (globin transcription factor 2)
GPR128         G protein-coupled receptor 128
HBA1           hemoglobin subunit alpha 1
HBA2           hemoglobin subunit alpha 2
HBB            hemoglobin subunit beta
HGF            hepatocyte growth factor
HNRNPK         heterogeneous nuclear ribonucleoprotein K
HOXA10         homeobox A10
HOXA9          homeobox A9
HOXB           homeobox B cluster
IDH1           isocitrate dehydrogenase 1 (NADP+)
IDH2           isocitrate dehydrogenase 2 (NADP+), mitochondrial
IQGAP1         IQ motif containing GTPase activating protein 1
KAT6A          lysine acetyltransferase 6A
KIT            KIT proto-oncogene receptor tyrosine kinase
KMT2A/MLL      lysine (K)-specific methyltransferase 2A
KRAS           Kirsten rat sarcoma viral oncogene homolog
MAFB           v-maf avian musculoaponeurotic fibrosarcoma oncogene B
MAFK           v-maf avian musculoaponeurotic fibrosarcoma oncogene homolog K
MAP1A          microtubule associated protein 1A
MAPK1          mitogen-activated protein kinase 1
MAST3          microtubule associated serine/threonine kinase 3
MEIS1          Meis homeobox 1
MEST           mesoderm specific transcript
MLK1/MAP3K9    mitogen-activated protein kinase 9
MLLT1          myeloid/lymphoid or mixed-lineage leukemia; translocated to, 1
MLLT3          myeloid/lymphoid or mixed-lineage leukemia; translocated to, 3
MLLT4          myeloid/lymphoid or mixed-lineage leukemia; translocated to, 4
MLLT10         myeloid/lymphoid or mixed-lineage leukemia; translocated to, 10
MN1            meningioma (disrupted in balanced translocation) 1
MST1           macrophage stimulating 1
MYH11          myosin, heavy chain 11, smooth muscle
NCOA2          nuclear receptor coactivator 2
NCOA3          nuclear receptor coactivator 3
NLRP12         NLR family, pyrin domain containing 12
NPM1           nucleophosmin (nucleolar phosphoprotein B23, numatrin)
NRAS           neuroblastoma RAS viral (v-ras) oncogene homolog
NUP214         nucleoporin 214kDa
OAZ1           ornithine decarboxylase antizyme 1
PHF6           PHD finger protein 6
PIM3           Pim-3 proto-oncogene, serine/threonine kinase
PML            promyelocytic leukemia
PI3K           phosphatidylinositol-4,5-bisphosphate 3-kinase
PTEN           phosphatase and tensin homolog
RAB1A          RAB1A, member RAS oncogene family
RAD21          RAD21 cohesin complex component
RARA           retinoic acid receptor alpha
RAS            Rat sarcoma viral oncogene
RBM15          RNA binding motif protein 15
RPN1           ribophorin I
RTF1           RTF1 homolog, Paf1/RNA polymerase II complex component
RUNX1          runt related transcription factor 1
RUNX1T1        runt related transcription factor 1; translocated to, 1
SCO2           SCO2 cytochrome c oxidase assembly protein
SF3B1          splicing factor 3b subunit 1
SIRPB1         signal regulatory protein beta 1
SMC1A          structural maintenance of chromosomes 1A
SMC2           structural maintenance of chromosomes 2
SMC3           structural maintenance of chromosomes 3
SNHG8          small nucleolar RNA host gene 8
SORBS3         sorbin and SH3 domain containing 3
SRSF2          serine/arginine-rich splicing factor 2
STAG2          stromal antigen 2
STAT3          signal transducer and activator of transcription 3
STAT5          signal transducer and activator of transcription 5
TET2           tet methylcytosine dioxygenase 2
TFG            TRK-fused gene
TP53           tumor protein p53
TP53BP1        tumor protein p53 binding protein 1
TTYH3          tweety family member 3
U2AF1/U2AF35   U2 small nuclear RNA auxiliary factor 1
UBE2G2         ubiquitin conjugating enzyme E2 G2
UBQLN1         ubiquilin 1
VCAN           versican
WT1            Wilms tumor 1
ZNF774         zinc finger protein 774
ZRSR2          zinc finger, RNA binding motif and serine/arginine rich 2

Acknowledgements

My sincere thanks go to Dr. Inanc Birol, who provided me with the opportunity to join his team as a graduate student and to grow and develop as a scientist under his guidance, mentorship, and constructive feedback.

I would like to thank my committee, Dr. Aly Karsan and Dr. Matthew Lorincz, for their insightful comments and encouragement.

To all the wonderful members of the Bioinformatics Technology Lab, thank you for making my time at the Genome Sciences Centre enjoyable. Special thanks to Dr. Ewan Gibb for his guidance and advice as well as to Ka Ming Nip and Readman Chiu for their technical support.

I would especially like to thank Dr. Gordon Robertson, who has been the most valuable resource a graduate student and aspiring independent researcher could ever hope for. Thank you for sharing your knowledge with me. The completion of this body of work in the set time period would never have been possible without you.

I would like to express my gratitude to the BC Cancer Foundation, Genome BC, and Genome Canada for funding this project. Thank you to my colleagues from Dr.
Aly Karsan’s lab, and in particular Rod Docking, who provided insight and knowledge that greatly assisted the research. I would also like to thank Cheryl Bishop, the Medical Genetics program assistant, who always promptly answered my questions and gave me advice.

Lastly, thank you to my parents for their unconditional love and encouragement and to James Lawson for his unwavering support throughout the completion of this thesis.

Dedication

Dedicated to my family

A special thanks to my loving parents, John and Mariann Pilsworth, who supported my decision to move away from home and pursue my career in academia. Many long phone conversations filled with words of encouragement and emotional support have helped me through the last few years. To my sister, Shannon, thank you for always being there for me and giving me advice.

To my best friend and partner James Lawson, who has watched countless practice presentations, given me feedback, and been my rock throughout my entire writing process.

Chapter 1: Introduction

1.1 Background and motivation

Acute myeloid leukemia (AML) is a heterogeneous disease with diverse genetic abnormalities that accumulate in hematopoietic progenitor cells (HPCs).1 It is characterized by the abnormal growth of immature white blood cells (WBCs) that have lost their ability to differentiate and undergo apoptosis. These immature blood cells accumulate in the bone marrow and interfere with the production of healthy blood cells.2 This decreases the number of platelets, red blood cells (RBCs) and normal WBCs, leading to a variety of symptoms.
These include fatigue and shortness of breath caused by anemia, infections caused by neutropenia, and easy bruising and bleeding caused by thrombocytopenia.1

AML most frequently occurs in early childhood or later adulthood.3 It is the most common type of leukemia in adults, and has the lowest survival among all leukemias.1 In the Western world, approximately 25% of leukemias in adults are diagnosed as AML.2 Previous studies have reported that the incidence of AML increases gradually with age, and that younger age at diagnosis is associated with better survival.3-5 Furthermore, the incidence varies with gender and ethnicity. Multiple studies have shown that AML is more common in males and more prevalent in Caucasians than in other ethnic groups.2,6,7 In the majority of cases, no direct cause of AML can be identified, although a variety of risk factors have been associated with the disease.2 Some inherited disorders such as Down syndrome, trisomy 8, and Fanconi anemia have been linked to a higher risk of AML.2,8,9 Other risk factors include irradiation and exposure to benzene.2 Progression of pre-existing blood disorders such as myeloproliferative neoplasms (MPN) and myelodysplastic syndromes (MDS) can result in secondary AML (sAML),10 and prior cytotoxic chemotherapy for a different malignancy can cause therapy-related AML (tAML).11

With current chemotherapy regimens, the overall survival rate is 35-40% for patients under the age of 60 and 10-15% for patients over the age of 60.12 Karyotyping or cytogenetic analysis of tumour DNA performed at diagnosis is the most important prognostic factor for predicting patient response.13 This entails counting the number of chromosomes and visually examining the chromosomes for structural changes.
More recently, molecular profiling using next-generation sequencing has identified new molecular markers that improve risk stratification, particularly in patients with a normal karyotype (NK).14 The identification of novel prognostic markers in the cytogenetically normal AML (CN-AML) group is critical, as current classification schemes do not account for the variability in outcome for patients with NKs.15 Discovery of recurrent gene mutations has revealed the genetic heterogeneity within cytogenetically defined subgroups of AML. These genetic aberrations are now the most important prognostic markers for assessing risk and selecting appropriate therapy, although a better classification scheme incorporating the heterogeneity of NK patients’ responses is still required.16,17

1.2 Pathogenesis

AML pathogenesis is thought to follow a model derived from Knudson’s two-hit hypothesis, which proposes that most cancers require at least two mutations, with the first being either germinal or somatic and the second always somatic.12,18 In AML, the model is framed in terms of Class 1 and Class 2 mutations, both of which are required for the development and progression of the disease.
Class 1 mutations are activating and confer a proliferative and survival advantage on the cancerous HPCs, but do not affect their differentiation, whereas Class 2 mutations impair hematopoietic differentiation and subsequent apoptosis of cells.19 Most AML cases are associated with a set of somatic mutations in HPCs accumulated throughout an individual’s life, where initiating mutations are present in landscaping genes such as epigenetic modifiers and late mutations affect proliferative genes involved in activated signaling.16 Multiple studies have demonstrated that AML genomes contain a high number of pre-existing mutations; therefore, only a small fraction of the mutations in the leukemic cells are relevant for pathogenesis and disease classification.20,21 A mutation in FLT3 is a prime example of an activating Class 1 mutation and has been identified across AML subtypes. Other examples of Class 1 mutations are activating mutations in KIT, NRAS, or KRAS.16,22 Class 2 mutations include the gene fusions RUNX1-RUNX1T1, CBFB-MYH11, and PML-RARA, and rearrangements involving KMT2A.16,22

1.3 Classification of AML

In the past, AML was diagnosed using criteria from the French-American-British (FAB) group, which are based only on morphological and cytochemical examination of the bone marrow and peripheral blood of leukemia patients.22,23 In 1997, the World Health Organization (WHO) developed a new classification scheme for AML, which uses morphological and cytochemical examination but also incorporates recurrent chromosomal rearrangements and molecular abnormalities.22,24 The classification criteria were updated in 2008 and are currently undergoing revision again.23,25 The main categories of the current classification include AML with recurrent genetic abnormalities, AML with myelodysplasia-related changes, therapy-related AML, and AML not otherwise specified (Table 1.1).23

Table 1.1: World Health Organization classification of AML in 2008

AML with recurrent genetic abnormalities
    AML with t(8;21)(q22;q22); RUNX1-RUNX1T1
    AML with inv(16)(p13.1q22) or t(16;16)(p13.1;q22); CBFB-MYH11
    AML with t(15;17)(q22;q12); PML-RARA
    AML with t(9;11)(p22;q23); MLLT3-MLL
    AML with t(6;9)(p23;q34); DEK-NUP214
    AML with inv(3)(q21q26.2) or t(3;3)(q21;q26.2); RPN1-EVI1
    AML (megakaryoblastic) with t(1;22)(p13;q13); RBM15-MLK1
    Provisional AML with mutated NPM1
    Provisional AML with mutated CEBPA
AML with myelodysplasia-related changes
Therapy-related myeloid neoplasms (tAML)
AML, not otherwise specified
    AML with minimal differentiation
    AML without maturation
    AML with maturation
    Acute myelomonocytic leukemia
    Acute monoblastic and monocytic leukemia
    Acute erythroid leukemia
    Acute megakaryoblastic leukemia
    Acute basophilic leukemia
    Acute panmyelosis with myelofibrosis

*Adapted from Vardiman et al., 2008

1.3.1 Diagnostic procedures

The routine diagnostic procedure for a patient suspected to have AML is a bone marrow aspirate. Smears of the blood and bone marrow are analyzed under a microscope; for a diagnosis of AML, the smear must contain a blast cell count of >20%.26 Flow cytometry is used for immunophenotyping to identify cell lineage involvement. The surface markers expressed on each cell are recorded to determine the diagnosis of AML with minimal differentiation, which expresses mostly early hematopoiesis-associated antigens such as CD34.26,27 A diagnostic karyotype to determine any structural changes is required, since chromosomal abnormalities have been detected in approximately 55% of AML cases.13,28 To establish a patient’s karyotype, at least 20 metaphase cells from the bone marrow are examined.26,29 In many cases, the karyotype can predict response to treatment, relapse risk, and overall survival.30 Moreover, DNA and RNA are usually extracted from the bone marrow or blood of patients, especially those with no chromosomal aberrations, to test for mutations in particular genes that carry prognostic significance.
Mutations in NPM1, CEBPA, and FLT3 are tested using reverse transcriptase-polymerase chain reaction (RT-PCR) and amplicon sequencing.31 A diagnostic karyotype in combination with the mutation status of these genes is the basis for risk-adapted treatment approaches.32

1.3.2 Prognostic factors

The main risk factors include age, WBC count, diagnosis of prior MDS or MPN, previous chemotherapy,10,11 and most importantly cytogenetic and molecular abnormalities.26 Karyotype analysis is the best prognostic predictor of response to induction chemotherapy and of survival.16,32 There are three risk categories: favourable, intermediate, and unfavourable. Patients with inv(16), t(8;21), or t(15;17) are placed in the favourable group, while patients with a complex karyotype with more than three chromosomal abnormalities or patients with loss or partial loss of chromosomes 5 or 7 are categorized as unfavourable.26,33 Generally, CN-AML patients are classified in the intermediate risk group, but their responses to chemotherapy vary.26,33 For example, CN-AML patients with an internal tandem duplication (ITD) in FLT3 usually respond poorly to treatment, whereas those with an NPM1 mutation (without FLT3-ITD) usually respond better to treatment and reach complete remission (CR).34 Patients with mutations in both CEBPA alleles (biallelic) have higher CR rates than those with mutations in only one allele.35 These prognostic factors are considered when selecting an appropriate treatment regimen.
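
The risk categories described above can be written as a small rule set. The following Python sketch is illustrative only and is not part of the thesis workflow: the names `Karyotype`, `cytogenetic_risk`, `cn_aml_expected_response`, and `COMPLEX_MIN_ABNORMALITIES` are hypothetical, the rule order (favourable rearrangements checked first) is an assumption, and the threshold of four encodes the phrase "more than three chromosomal abnormalities" used in this section. It is a reading aid, not a clinical tool.

```python
from dataclasses import dataclass

# Rearrangements named in the text as markers of a favourable response.
FAVOURABLE_REARRANGEMENTS = frozenset({"inv(16)", "t(8;21)", "t(15;17)"})

# Assumption: "more than three chromosomal abnormalities" is read as >= 4.
COMPLEX_MIN_ABNORMALITIES = 4


@dataclass
class Karyotype:
    """Hypothetical summary of a diagnostic karyotype."""
    rearrangements: frozenset = frozenset()
    abnormality_count: int = 0
    loss_of_chr5_or_7: bool = False  # loss or partial loss of chromosome 5 or 7


def cytogenetic_risk(karyotype: Karyotype) -> str:
    """Return 'favourable', 'unfavourable', or 'intermediate'."""
    # Assumption: a favourable rearrangement takes precedence over other findings.
    if FAVOURABLE_REARRANGEMENTS & set(karyotype.rearrangements):
        return "favourable"
    if (karyotype.abnormality_count >= COMPLEX_MIN_ABNORMALITIES
            or karyotype.loss_of_chr5_or_7):
        return "unfavourable"
    # Everything else, including a normal karyotype, falls here.
    return "intermediate"


def cn_aml_expected_response(flt3_itd: bool, npm1_mutated: bool,
                             cebpa_mutated_alleles: int) -> str:
    """Coarse molecular refinement for CN-AML, following the examples in the text."""
    if flt3_itd:
        return "poorer"
    if npm1_mutated or cebpa_mutated_alleles == 2:
        return "better"
    return "variable"
```

For example, `cytogenetic_risk(Karyotype())` places a normal karyotype in the intermediate group, and `cn_aml_expected_response(False, True, 0)` reflects the better response reported for NPM1-mutated cases without FLT3-ITD.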
1.3.3 Standard treatment

The conventional induction therapy consists of 3 days of an anthracycline such as daunorubicin and 7 days of cytarabine.36 Response to this therapy is assessed after 21 days; CR is achieved in 60-80% of patients aged 18-60 years.15 Post-remission therapy entails either consolidation chemotherapy or high-dose chemotherapy followed by an autologous or allogeneic hematopoietic stem cell transplant (SCT).15,36 An autologous SCT is performed on patients with favourable or intermediate cytogenetics, while an allogeneic SCT is performed on patients with high-risk genetics.37 An allogeneic SCT has the lowest relapse rate, with an overall survival rate approximately 30% higher than with consolidation chemotherapy.38 Patients who do not respond to initial induction therapy are considered to have primary refractory disease, and are recommended for an allogeneic SCT.39

1.4 Genetics of AML

The first insights into the genetic basis of leukemia came in 1973, when Janet Rowley discovered a recurrent translocation between chromosomes 9 and 22, the Philadelphia chromosome translocation, in chronic myeloid leukemia (CML) cells. In that same year, Rowley identified a translocation between chromosomes 8 and 21 in AML cells, and over the next decade many more chromosomal aberrations were detected in hematologic malignancies.40 The most common chromosomal abnormalities that have been identified in AML include t(8;21)(q22;q22) and inv(16)(p13;q22), which are considered core-binding factor (CBF) AMLs, and t(15;17)(q22;q11), which is characteristic of acute promyelocytic leukemia (APL).41 These structural rearrangements are established indicators of a favourable outcome, and these subtypes account for approximately 20% of all AML cases.40,41 For example, 85-95% of patients with these subtypes respond to therapy and reach CR, whereas only 75-80% of all AML patients achieve CR.
Similarly, the 5-year survival rate for patients with these subtypes is 60-70%, compared to 45% for all AML patients.22 The structural alterations that are known predictors of a poor outcome include translocations involving 11q23 (KMT2A), complex karyotypes with three or more chromosomal abnormalities, and loss or partial loss of chromosomes 5 or 7.40 Patients with these distinct cytogenetic alterations are treated accordingly. However, approximately 50% of patients are diagnosed with AML in the absence of cytogenetic abnormalities, and each case responds differently to conventional chemotherapy, demonstrating the need for improved risk stratification.32

1.4.1 Core-binding factor leukemias

The core-binding factors (CBFs) are a set of hematopoietic transcription factors that are essential for hematopoiesis and regulate genes associated with lymphoid and myeloid differentiation.12 CBF is a heterodimer that includes a CBF-beta (CBFB) subunit and an alpha subunit encoded by one of the RUNX family genes. CBFB is essential for CBF function, as it increases the affinity of the RUNX proteins for DNA and protects them from proteolytic degradation.42 The chromosomal rearrangements t(8;21), discovered in 1973, and inv(16), found in 1982, disrupt the CBF transcription complex and give rise to RUNX1-RUNX1T1 and CBFB-MYH11 fusion transcripts, respectively.43 The expression of these fusion genes interferes with the normal CBF allele, acting as a dominant negative inhibitor of CBF-mediated transcription and resulting in a loss-of-function for CBF that prevents hematopoietic cell differentiation.12 RUNX1-RUNX1T1 accounts for 10% of AML cases and is associated with neutrophil maturation. The fusion protein CBFB-MYH11 accounts for 5% of AML cases and produces abnormal eosinophils.43 The treatment is usually intensive chemotherapy, and the CR rate in these cytogenetic groups is approximately 85%.
Approximately 20-45% of patients with CBF-AML carry KIT mutations, which have been shown to negatively impact their prognosis.41

1.4.2 Acute promyelocytic leukemia

In 1977, Rowley demonstrated that t(15;17) was a determinant for APL. The resulting fusion transcript PML-RARA blocks hematopoietic differentiation at the promyelocyte stage.40 The treatment for this specific fusion is high-dose all-trans-retinoic acid (ATRA), which induces proteolysis of PML-RARA and drives the differentiation of the promyelocytes into mature granulopoietic cells.44 More than 80% of APL cases are cured with this treatment regimen.45 Treatment with arsenic trioxide has also been shown to be effective in targeting the PML-RARA fusion protein for degradation.44

Studies in mouse models have shown that these loss-of-function mutations in hematopoietic transcription factors are not sufficient to cause AML on their own.12 Transgenic mice that express either the RUNX1-RUNX1T1 or CBFB-MYH11 fusion proteins in adult hematopoietic progenitors do not develop AML unless a chemical mutagen such as ethyl-nitrosourea (ENU) is used to initiate leukemogenesis.46,47 The expression of PML-RARA in transgenic mice has shown some features of APL; however, the disease has a long latency and incomplete penetrance.48,49 Typically, a second hit that increases proliferation and survival is required for the development of AML.12

1.4.3 11q23 rearrangements (KMT2A)

Reciprocal translocations on chromosome 11q23 involving the KMT2A gene are reported in 10% of adult AML cases.43 The KMT2A gene encodes a multi-domain protein that regulates transcription of developmental genes such as the HOX gene family.50 These translocations create oncogenic fusion proteins in the methyltransferase domain, resulting in hypermethylation at H3K79 by the histone methyltransferase DOT1L.
This leads to aberrant expression of genes including HOXA9 and MEIS1 that drive leukemogenesis.51 In 2008, the prognosis for patients with t(11q23) was thought to be unfavourable for every fusion partner of KMT2A. However, cases with t(9;11)(p21;q23) leading to a KMT2A-MLLT3 fusion have recently been shown to have a better prognosis.52,53 Similar results were observed in AML cases with t(11;19)(q23;p13) involving MLLT1.54 In contrast, patients with t(6;11)(q27;q23) and t(10;11)(p12;q23) involving the MLLT4 and MLLT10 genes, respectively, have been shown to have a poor prognosis.32,54 Current treatment is limited to standard chemotherapy and an allogeneic SCT.50

1.4.4 Complex karyotypes, trisomies, and monosomal karyotypes

Complex karyotypes, defined by the presence of three or more acquired chromosomal aberrations, account for 10-15% of AML cases.43 They are characterized by chromosomal losses and gains, and are frequently associated with TP53 mutations, which negatively impact survival.55 Trisomies are commonly detected in a complex karyotype, although they can also be detected alone or with other structural rearrangements. For example, trisomy 8 has been detected in a complex karyotype, as a second mutation in t(8;21), inv(16) or t(9;11), and as a sole chromosomal aberration. Patients with trisomy 8 alone or a complex karyotype including trisomy 8 are expected to have a poor treatment outcome, whereas patients with trisomy 8 in the presence of t(8;21), inv(16) or t(9;11) are predicted to have better responses to standard treatment.30 The trisomies in de novo AML, in decreasing frequency, are +8, +22, +13, +21, and +11.28 Autosomal monosomies are also present in complex karyotypes, where the most prevalent single monosomies reported in AML are of chromosomes 5 and 7.
Loss of entire chromosomes (-7, -5) has been associated with lower OS than del(7q) and del(5q).56 Moreover, a monosomal karyotype, defined as two or more autosomal monosomies or a single monosomy accompanied by a second structural abnormality, has been shown to be a better indicator of poor prognosis than a complex karyotype.28,56

1.4.5 Recurrent mutations

Cytogenetic analysis remains the most important indicator for AML risk stratification, but with advances in sequencing technology, molecular analysis became an appealing method to aid in stratifying those cases with normal cytogenetics.16 Recurrent mutations that carry prognostic value have been identified and have allowed for refinement of cytogenetic risk classification. However, the current classification scheme is not completely accurate, suggesting that a better understanding of the underlying genetics relevant to AML pathogenesis is required.57

NPM1 mutations

NPM1 is a nucleolar protein that is involved in ribosome biogenesis, DNA repair, and apoptosis.43 In 2005, Falini and colleagues reported aberrant cytoplasmic localization of the NPM1 protein and discovered a four base-pair insertion that resulted in a frameshift and created an export signal.58 This mutation mediates abnormal localization of the NPM1 protein to the cytoplasm, thereby disrupting the normal shuttling of the protein between the nucleus and the cytoplasm.59 NPM1 is one of the most frequently mutated genes in AML, with mutations identified in 25-35% of AML patients.
In particular, NPM1 mutations are observed in 45-65% of CN-AML cases17 and are commonly accompanied by an ITD in FLT3.57 The prognosis for patients with mutated NPM1 without FLT3-ITD is favourable, whereas patients with both an NPM1 mutation and an ITD in FLT3 have a less favourable outcome.26,34 NPM1 mutations are also correlated with a better outcome in older patients, indicating that such patients may benefit from intensive chemotherapy.60

CEBPA mutations

CEBPA is part of the leucine zipper (LZ) family of transcription factors. These proteins activate myeloid gene expression to induce granulocytic differentiation.35 Mutations in CEBPA are detected in 10-18% of CN-AML patients.43 There are two types of mutations frequently observed in the CEBPA gene: nonsense mutations in the N-terminus and in-frame mutations in the C-terminus. The nonsense mutations result in a truncated protein that inhibits wild-type CEBPA in a dominant negative gain-of-function manner.
The in-frame mutations occur in the LZ domain and prevent CEBPA from binding DNA.61,62 Both types of mutations disrupt the normal role of CEBPA, which is involved in neutrophil differentiation.59 CEBPA mutations can be either monoallelic or biallelic.63 In 60% of CEBPA-mutated CN-AML cases, these mutations are biallelic and are associated with a specific gene expression signature that confers a favourable prognosis.35,64,65 Single mutations in CEBPA do not carry any prognostic value, although they have been reported to coexist frequently with FLT3-ITD and NPM1 mutations.63

FLT3 mutations

Nakao and colleagues discovered the first mutation in FLT3 in 1996.66 FLT3 is a hematopoietic growth factor receptor and functions via auto-inhibition.22 ITDs in FLT3 have been reported in 20% of all AML cases and in 28-34% of CN-AML cases.57 CN-AML patients with FLT3-ITD are predicted to have a poor outcome.67 These FLT3-ITD mutations are in-frame duplications that usually occur within the juxtamembrane (JM) domain and vary in length from 3 to more than 400 base pairs, with longer length correlated with poorer survival.68 Additionally, a point mutation at aspartic acid residue 835 (D835) in the activation loop of the tyrosine kinase domain (TKD) of FLT3 has been detected in 11-14% of CN-AML patients, although its effect on prognosis has been controversial.43,69 Mutations in these two domains diminish the receptor's auto-inhibitory function, resulting in constitutive activation of the FLT3 tyrosine kinase and enhanced RAS, MAPK, and STAT5 signaling.22 Multiple studies have also reported that patients with a higher allelic ratio of FLT3-ITD to wild-type FLT3 have a worse clinical outcome.70 An allogeneic SCT is an attractive option for those harbouring FLT3-ITD mutations since standard chemotherapy usually gives poor results.57 Furthermore, several tyrosine kinase inhibitors such as Sorafenib, Midostaurin, and Quizartinib are being investigated for their activity against cells
with FLT3-ITD.71-73

KIT mutations

KIT is another tyrosine kinase receptor that is expressed on the surface of 80% of AML cells.74 Mutations are detected in 25-30% of CBF AML cases with either t(8;21) or inv(16) and are associated with high relapse rates.75 The most frequent mutation reported is in the activation loop at aspartic acid 816 (D816), resulting in increased STAT3/STAT5 and PI3K/Akt signaling. Another mutation in exon 8 at asparagine 822 (N822) results in increased MAPK and PI3K/Akt signaling.12,43 Treatment with the tyrosine kinase inhibitor Dasatinib is effective for KIT D816, and Imatinib shows activity against KIT N822.41,76

RAS mutations

The RAS family proteins are a group of small GTPases involved in hematopoiesis.43 AML has been associated with the constitutive activation of RAS from mutations in the upstream tyrosine kinase receptors FLT3 and KIT as well as mutations in NRAS and KRAS at codons 12, 13, or 61.77 The mutations in NRAS and KRAS are detected at a low frequency and are not useful in predicting response to therapy.12,22

KMT2A partial tandem duplications

After the discovery of multiple translocations involving the KMT2A gene, partial tandem duplications (PTDs) that are not visible via karyotype analysis were detected.78 The DNA binding motifs of KMT2A between exons 5 and 11 or exons 5 and 12 are duplicated and inserted in-frame into intron 4.43 KMT2A-PTDs are detected in 8-10% of CN-AML cases and are classified as unfavourable markers.79 The recommended treatment is an early allogeneic SCT.43

RUNX1 mutations

RUNX family proteins have an essential role in the regulation of gene expression through transcriptional repression and epigenetic silencing.80 The RUNX1 gene encodes the alpha subunit of the CBF complex and is involved in the t(8;21) fusion transcript.
Missense, nonsense, and frameshift mutations were subsequently observed in RUNX1 and have also been shown to disrupt this complex.59 RUNX1 mutations occur more often in older patients than in younger patients and rarely co-occur with NPM1 or CEBPA mutations.81 RUNX1 mutations have been shown to coexist with KMT2A-PTD and IDH1/2 mutations, and are associated with trisomies 13 and 21.17 Multiple studies have shown that RUNX1 mutations correlate with low CR rates and a worse OS.82 Furthermore, heterozygous germline mutations in RUNX1 have been observed in familial platelet disorder, which has autosomal dominant inheritance and a predisposition to AML.43 Acquired somatic mutations in the Runt domain have also been found in MDS.83,84

DNMT3A mutations

DNMT3A is an epigenetic regulator that catalyzes the addition of a methyl group to cytosine residues.43 In 2010, Ley and colleagues identified a mutation in DNMT3A at arginine residue 882 (R882), and further mutational analysis revealed a variety of mutations in the open reading frame; still, heterozygous mutations at R882 account for 50% of DNMT3A mutations.85 It is thought that the R882 mutation acts as a dominant-negative regulator of the wild-type DNMT3A.86 Mutations in DNMT3A are present in 20-25% of all AML patients and 36% of CN-AML patients.
These mutations commonly co-occur with NPM1 and FLT3 mutations and are linked to an unfavourable prognosis.85

IDH mutations

IDH1 and IDH2 are enzymes in the citric acid cycle that convert isocitrate to α-ketoglutarate.43 IDH1 mutations were initially found in gliomas, and heterozygous point mutations have since been observed in AML at arginine residues in IDH1 at codon 132 and in IDH2 at codon 140 or 172.87 Mutations in either IDH1 or IDH2 are detected in approximately 15-20% of all AML cases and in 25-30% of CN-AML cases.88 The mutant enzymes acquire a neomorphic function that converts α-ketoglutarate to 2-hydroxyglutarate (2HG).89,90 High concentrations of 2HG have been shown to inhibit histone and DNA demethylases, since they require α-ketoglutarate as a substrate.91 In addition, overexpression of IDH enzymes has been shown to block differentiation by inducing histone and DNA hypermethylation.92,93 Multiple studies have shown that IDH1/2 mutations and NPM1 mutations are frequently observed together and that IDH1/2 and TET2 mutations are mutually exclusive, suggesting that IDH1/2 and TET2 mutations share a proleukemogenic effect.34

TET2 mutations

TET2 is an epigenetic regulator that uses α-ketoglutarate as a substrate to convert 5-methylcytosine to 5-hydroxymethylcytosine.43 Mutations of the TET2 gene at chromosome 4q24 have been observed in 8-23% of adult AML cases and, in particular, 18-23% of CN-AML cases.94 TET2 mutations are also common in MDS and MPN.43 The types of mutation vary and include nonsense mutations, deletions, and splice site mutations across all 11 exons.
These mutations usually introduce a frameshift that results in a truncated TET2 protein and therefore inadequate production of this suspected tumour suppressor.59,95 It has been shown in mouse models that deletion of TET2 results in increased self-renewal ability of hematopoietic stem cells and myeloid malignancy progression.96

ASXL1 mutations

ASXL1 is part of the Polycomb group of proteins and acts as a transcriptional regulator.43 Mutations in ASXL1 were first identified in 2009 and are found in 6-30% of AML cases.97 ASXL1 mutations are more frequent in older AML patients (> 60 years of age), and are more often present in patients with sAML.98 Somatic deletions and point mutations are the most common types of mutations observed and have been associated with an unfavourable prognosis.97 ASXL1 is thought to have a role in the recruitment and stabilization of the Polycomb-repressive complex 2 (PRC2) and has been shown in a mouse model with hematopoietic NRAS overexpression to accelerate disease progression and increase tumour burden.99,100

Other gene mutations

Mutations in the WT1 tumour suppressor gene, encoding a zinc finger transcription factor, are detected in 10-13% of CN-AML cases, although their effect on prognosis is not clear.17,101 The PHF6 tumour suppressor gene, located on the X chromosome and encoding a plant homeodomain protein, is frequently mutated in T-cell acute lymphocytic leukemia (T-ALL).43,102 PHF6 mutations are found in 3% of AML cases, although the functional role of PHF6 in AML is not well understood.102 Moreover, recurrent mutations in genes involved in the cohesin complex and the spliceosome complex have been recently reported.
The components of the cohesin complex are involved in controlling chromatid separation during mitosis and transcriptional regulation of cell maturation.43 Three genes in the cohesin complex, STAG2, SMC3, and SMC1A, are recurrently mutated and mutually exclusive of each other, suggesting that one mutation is sufficient to disrupt the entire complex.20 Mutations in genes of the spliceosome complex have been found in MDS and therefore are more prevalent in sAML. Mutations in SF3B1, U2AF1, SRSF2, and ZRSR2 have been reported in de novo AML.103

Due to the vast amount of knowledge pertaining to recurrently mutated genes and their impacts on treatment responses, Patel and colleagues constructed a revised risk stratification chart from integrated cytogenetic and mutational analysis in 2012 (Table 1.2). This revision requires validation before it can be incorporated into the WHO classification scheme.34

Table 1.2: Revised risk stratification in 2012

| Cytogenetic Classification | Mutations | Overall Risk Profile |
|---|---|---|
| Favourable | Any | Favourable |
| Normal karyotype or intermediate-risk cytogenetic lesions | FLT3-ITD-negative; mutant NPM1 and IDH1 or IDH2 | Favourable |
| Normal karyotype or intermediate-risk cytogenetic lesions | FLT3-ITD-negative; wild-type ASXL1, MLL-PTD, PHF6, and TET2 | Intermediate |
| Normal karyotype or intermediate-risk cytogenetic lesions | FLT3-ITD-negative or positive; mutant CEBPA | Intermediate |
| Normal karyotype or intermediate-risk cytogenetic lesions | FLT3-ITD-positive; wild-type MLL-PTD, TET2, and DNMT3A and trisomy 8-negative | Intermediate |
| Normal karyotype or intermediate-risk cytogenetic lesions | FLT3-ITD-negative; mutant TET2, MLL-PTD, ASXL1, or PHF6 | Unfavourable |
| Normal karyotype or intermediate-risk cytogenetic lesions | FLT3-ITD-positive; mutant TET2, MLL-PTD, DNMT3A, or trisomy 8, without mutant CEBPA | Unfavourable |
| Unfavourable | Any | Unfavourable |

*Adapted from Patel et al., 2012

1.5 Results from The Cancer Genome Atlas project

The Cancer Genome Atlas (TCGA) consortium set out to illustrate the genetic and epigenetic landscape of de novo AML by sequencing 200 genomes and transcriptomes. An average of 13 genes were mutated in each AML genome, and in 99% of cases at least one potential driver mutation was reported.33 Mutations were analyzed for co-occurrence and mutual exclusivity to establish mutational patterns.
KMT2A fusions and PML-RARA fusions were found to require fewer cooperating mutations compared to other initiating mutations. Mutation prevalence was high in the known genes including DNMT3A, FLT3, NPM1, IDH1, IDH2, and CEBPA, but also in some genes recently connected to AML development including U2AF1, EZH2, SMC1A, and SMC3.33 Mutual exclusivity was exhibited between the PML-RARA, CBFB-MYH11, and KMT2A fusions. RUNX1 and TP53 mutations were mutually exclusive of FLT3 and NPM1 mutations. Moreover, FLT3, other tyrosine/serine-threonine kinases, and RAS family genes were mutually exclusive. Mutations in FLT3, NPM1, and DNMT3A significantly co-occurred.33 This shows the heterogeneity of AML and the challenges in defining which mutations are of prognostic value. Moreover, many recurrent mutations could not be classified according to the traditional concept of Class 1 and Class 2 mutations, and instead were divided into nine different functional categories (Table 1.3).43

Table 1.3: Improved functional categories of genetic changes in AML

Before 2008 (analysis: cytogenetic and molecular genetic analysis)
    Class 1: activated signaling (FLT3, KIT, RAS mutations)
    Class 2: transcription and differentiation (t(8;21), inv(16), t(15;17), CEBPA mutations)

2008-2012 (analysis: next-generation sequencing approaches)
    Class 1: activated signaling (FLT3, KIT, RAS mutations)
    Class 2: transcription and differentiation (t(8;21), inv(16), t(15;17), CEBPA, RUNX1 mutations)
    Class 3: epigenetic modifiers (TET2, DNMT3A, ASXL1 mutations)

From 2013 (analysis: The Cancer Genome Atlas project), with prevalence in AML (%)
    Class 1: transcription factor fusions (t(8;21), inv(16), t(15;17), KMT2A): 18%
    Class 2: nucleophosmin 1 (NPM1 mutations): 27%
    Class 3: tumour suppressor genes (TP53, WT1, PHF6 mutations): 16%
    Class 4: DNA-methylation-related genes (DNA methyltransferases: DNMT3A; DNA hydroxymethylation: IDH1, IDH2, TET2): 44%
    Class 5: activated signaling genes (FLT3, KIT, RAS mutations): 59%
    Class 6: chromatin-modifying genes (ASXL1, EZH2 mutations, KMT2A fusions and PTDs): 30%
    Class 7: myeloid transcription factor genes (CEBPA, RUNX1 mutations): 22%
    Class 8: cohesin complex genes (STAG2, RAD21, SMC1, SMC2 mutations): 13%
    Class 9: spliceosome complex genes (SRSF2, U2AF35, ZRSR2 mutations): 14%

*Adapted from Meyer & Levine 2014

1.6 Gene expression profiling

In 1999, Golub and colleagues demonstrated that gene expression profiling could be used to predict subtypes in AML based on their molecular signatures.104 Nowadays, gene expression profiling is widely used in research to predict specific cytogenetic and molecular alterations as well as to discover novel subtypes with distinct gene expression patterns. This method is robust, since multiple studies have confirmed the ability to predict subtypes with different platforms and bioinformatics tools.105 For example, Schoch et al.106 and Valk et al.105 independently showed that large structural rearrangements in AML such as t(15;17)/PML-RARA, t(8;21)/RUNX1-RUNX1T1, and inv(16)/CBFB-MYH11 were distinct from other subtypes using gene expression analysis.
Cases with complex karyotypes were also separated from other subtypes, specifically due to the up-regulation of DNA repair genes.105,106 Furthermore, NPM1-mutated cases exhibited high expression of the HOX gene cluster, whereas CEBPA-mutated cases were correlated with the down-regulation of HOXA and HOXB and up-regulation of erythroid-specific genes such as GATA1 and EPOR.107,108 Moreover, high expression of single genes such as ERG and EVI1 has been shown to predict poor outcomes in AML patients.109,110 Although gene expression profiling has proved to be beneficial in a research setting, it has yet to be implemented in clinical practice.111

1.7 Summary

The discovery and characterization of recurrent mutations in NPM1, FLT3, and CEBPA were breakthroughs for stratifying AML patients with normal cytogenetics.34 More recently, mutations in epigenetic modifiers such as DNMT3A, IDH1/2, and TET2 have been identified and are helping to improve the classification scheme.100 However, a complete understanding of this disease is still elusive, and a better classification scheme, specifically for the normal cytogenetics class, is required. Molecular classification based on gene expression profiling is an attractive way to distinguish between subclasses in one comprehensive analysis and has identified novel subtypes and single gene prognostic predictors in earlier studies.108 The goal of my project was to use gene expression profiling to categorize novel subclasses in the normal cytogenetics class, and to identify fusions in this group that cannot be detected using standard karyotype analysis.

1.8 Hypothesis

Distinct gene expression and mutation profiles define novel subclasses of CN-AML and may be useful in predicting prognosis.
1.9 Thesis objectives

1) Examine gene expression patterns in CN-AML cases compared to established subtypes and report mutation profiles in known disease genes
2) Define distinct subclasses of CN-AML based on gene expression profiles and identify novel fusions in each subclass
3) Determine the differentially expressed genes that discriminate CN-AML subclasses

Chapter 2: Methods

2.1 Summary

RNA-Seq was performed on bone marrow or peripheral blood samples from 124 patients with de novo and secondary AML, of which 53 were CN-AML patients, from the Personalized Medicine Project (PMP) cohort. Gene expression was quantified and cluster analysis was performed on all patients (n=124) and on CN-AML patients (n=53). RNA-Seq samples were assembled using a de novo transcriptome assembly tool and structural variants were detected. Differential expression analysis comparing the subgroups identified by clustering was performed, and discriminatory genes defining each subgroup were determined.

2.2 Primary AML samples

Primary human AML samples were isolated from bone marrow or peripheral blood from consenting patients with AML in the Leukemia/Bone Marrow Transplant Program of British Columbia (BC). The BC Cancer Agency's Clinical Review Ethics Board approved the collection and use of human tissue for this study. All patients were between ages 25 and 75, and all samples were collected between December 30th, 1992 and November 11th, 2011.

2.3 Library construction and sequencing

Plate-based libraries were prepared at the BC Cancer Agency's Genome Sciences Centre (BCGSC) following the vendor's paired-end (PE) protocol, and the libraries were sequenced on Illumina HiSeq2000 instruments to generate PE sequences with 75 base pair (bp) reads. Refer to the Methods section in Docking et al. (manuscript in preparation).
2.4 Gene expression quantification

Sailfish, an alignment-free abundance estimation method, was used to quantify gene expression.112 This approach uses the concept of k-mer counting and is executed in two phases. First, an index is built from a set of reference transcripts and a specific k-mer length. Second, the indexed k-mers from the reference transcripts are counted in the RNA-Seq reads and the relative abundance of each transcript is estimated in reads per kilobase per million mapped reads (RPKM). Sailfish v0.6.3 was used in this analysis with the following two steps and parameters:

1. Prepare reference
    ./sailfish –index refseq_transcript_reference.fasta –output reference_path –k 20
2. Calculate transcript abundance
    ./sailfish –quant reference_path –iterations 1000 –min_abundance 0 –delta 0.005 --out output_path

2.5 Unsupervised consensus clustering and survival analysis

Consensus clustering is a quantitative method for estimating the number of unsupervised classes and is derived from repeated subsampling and clustering. A data matrix containing samples (items) as rows and gene features as columns was generated. The variance of each gene was calculated and genes with a variance equal to zero were removed. The dataset was reduced to the top 5000 most variable genes by computing the median absolute deviation, and this filtered matrix was used as input to ConsensusClusterPlus v1.20.0 in R v3.1.3. The following parameters were used:

    maxK=10
    reps=10000
    pItem=0.8
    pFeature=1
    distance="pearson" or "spearman"
    clusterAlg="hc" or "pam"

The variable k denotes the number of clusters; up to 10 clusters were evaluated. For consensus clustering, 10000 repetitions of resampling were performed on 80% of samples or items (pItem) and 100% of gene features (pFeature). Two distance parameters and two clustering algorithms were explored to optimize the consensus clustering result.
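The gene-filtering step described above can be illustrated with a short sketch. This is a minimal Python illustration under stated assumptions, not the R code used for this analysis: the function names and the toy expression values are hypothetical, and the unscaled median absolute deviation (MAD) is used, which gives the same gene ranking as a MAD multiplied by a constant scale factor (as R's `mad` does by default).

```python
from statistics import median, pvariance


def median_absolute_deviation(values):
    """Return the unscaled MAD: median(|x - median(x)|)."""
    centre = median(values)
    return median(abs(v - centre) for v in values)


def top_variable_genes(expression, n_top):
    """Drop zero-variance genes, then keep the n_top genes ranked by MAD.

    expression: dict mapping gene name -> list of abundances (one per sample).
    Returns the selected gene names, most variable first.
    """
    # Step 1: remove genes whose variance across samples is exactly zero.
    informative = {g: v for g, v in expression.items() if pvariance(v) > 0}
    # Step 2: rank the remaining genes by MAD, largest first.
    ranked = sorted(
        informative,
        key=lambda g: median_absolute_deviation(informative[g]),
        reverse=True,
    )
    return ranked[:n_top]


# Hypothetical toy matrix (four samples); the thesis kept the top 5000 genes.
toy = {
    "GENE_A": [1.0, 1.0, 1.0, 1.0],     # zero variance, removed
    "GENE_B": [0.0, 10.0, 20.0, 30.0],  # MAD = 10
    "GENE_C": [5.0, 6.0, 5.0, 6.0],     # MAD = 0.5
    "GENE_D": [2.0, 4.0, 6.0, 8.0],     # MAD = 2
}
selected = top_variable_genes(toy, n_top=2)
print(selected)
```

Ranking by MAD rather than variance makes the selection less sensitive to a single outlying sample, which is presumably why it is a common choice for this kind of pre-filtering.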
A consensus matrix with the samples as both the rows and columns was produced for each k, where consensus values are calculated from the number of times a pair of samples is assigned to the same cluster across the 10000 iterations. A heatmap of the consensus matrix was generated to determine the optimal cluster solution, where the consensus values range from 0 (never clustered together; white) to 1 (always clustered together; blue). The cluster assignments are marked by coloured rectangles, and a dendrogram showing the degree of similarity between samples was built. Cumulative distribution function (CDF) plots of the consensus matrices for each k were estimated by a histogram of 100 bins. A consensus matrix with only 0's and 1's would generate a histogram with two bins centered at 0 and 1, and a CDF curve with a step around 0, a flat line from 0 to 1, and a second step around 1. The cluster assignments for samples were recorded at each k and plotted in an item-tracking plot to evaluate the stability of the clusters.113 The silhouette width of each sample in an assigned cluster was computed by comparing how similar the sample is to the other samples in the same cluster. A value of 1 indicates the sample fits well in the cluster, whereas a value near 0 indicates the sample does not share many similarities with the other samples in the same cluster. A silhouette-width profile was generated using the R 'cluster' package v2.0.3 by reordering samples to display the same order as the consensus matrix heatmap. Kaplan-Meier survival curves were generated using the R 'survival' package v2.38-2, and the log-rank test was used to compute a p value for overall survival by testing the null hypothesis that there is no difference between population survival curves. A vertical gap between curves indicates that one group had a greater fraction of patients surviving at that time point.
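To make the log-rank comparison concrete, the sketch below implements a two-group log-rank test from first principles. It is a minimal Python illustration, not the R 'survival' code used here; the function name and the toy follow-up times are hypothetical. At each distinct event time it compares the observed number of deaths in the first group with the number expected under the null hypothesis, sums the hypergeometric variances, and refers the resulting statistic to a chi-square distribution with one degree of freedom.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p value).

    times_*: follow-up times; events_*: 1 if death was observed, 0 if censored.
    """
    # Pool both groups; the third field marks the group (0 = A, 1 = B).
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    observed_a = 0.0
    expected_a = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = [row for row in data if row[0] >= t]
        n = len(at_risk)                                   # at risk overall
        n_a = sum(1 for row in at_risk if row[2] == 0)     # at risk in group A
        d = sum(1 for row in at_risk if row[0] == t and row[1] == 1)
        d_a = sum(
            1 for row in at_risk
            if row[0] == t and row[1] == 1 and row[2] == 0
        )
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the group A death count at time t.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi_sq = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_sq / 2))
    return chi_sq, p_value


# Hypothetical toy data: group A dies early, group B dies late, no censoring.
chi_sq, p_value = logrank_test(
    [1, 2, 3, 4, 5], [1, 1, 1, 1, 1],
    [6, 7, 8, 9, 10], [1, 1, 1, 1, 1],
)
print(f"chi-square = {chi_sq:.2f}, p = {p_value:.4f}")
```

Censored patients contribute to the at-risk counts until their last follow-up time but never to the death counts, which is how the test uses incomplete follow-up without discarding those patients.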
2.6 Transcriptome assembly and variant analysis

RNA-Seq libraries were assembled with Trans-ABySS (version 1.5.2 – http://www.bcgsc.ca/platform/bioinfo/software/trans-abyss/releases).114 Contigs were aligned against the human genome reference hg19 with exon-exon junctions using GMAP (version 2014-12-28 – https://www.hpc.science.unsw.edu.au/software/gmap/2014-12-28). Variants were detected using PAVfinder (version 0.2.0 – https://github.com/bcgsc/pavfinder). Gene fusion candidates were filtered by requiring at least four reads spanning the contig breakpoint with at least four flanking base pairs. Alignment of contigs and reads spanning the fusion breakpoint to the human genome reference hg19 was manually performed for each gene fusion candidate using the BLAT alignment tool from the UCSC Genome Browser (https://genome.ucsc.edu/cgi-bin/hgBlat). The cBioPortal for Cancer Genomics online database was used to visualize the fusion events as a heatmap or an OncoPrint (http://www.cbioportal.org/oncoprinter.jsp). The genes involved in the fusions were compared to the reported mutations from the TCGA AML cohort in cBioPortal.115,116

2.7 Cutoff optimization and survival analysis of significant genes

A cutoff value of gene expression was determined by fitting a Cox proportional hazards model to the gene variable and the survival variable. The optimal cutoff was selected by maximizing the significance, assessed by the log-rank test, of the stratification of patients into two separate groups.117 Kaplan-Meier analysis was performed as described in Section 2.5.

2.8 Differential expression analysis

Statistically significant changes in gene expression between clusters were analyzed using 'samr' v2.0 in R v3.1.3, which is an RNA-Seq adaptation of Significance Analysis of Microarrays (SAM). A multi-class SAM analysis was performed using the top 5000 most variable genes previously determined and described in Section 2.5.
The following parameters were used:

    resp.type="Multiclass"
    random.seed=100
    nperms=1000
    testStatistic="wilcoxon"
    fdr.output=0.05

The top-ranked 20% of genes were filtered from the SAM output files and abundance heatmaps were generated using 'pheatmap' in R v3.1.3. The RPKM gene expression matrix was also filtered to keep the same gene records, and then each row (gene) was log-transformed and reordered in the heatmap using hierarchical clustering.118 Multiple two-class (unpaired) SAM analyses were performed to obtain a list of differentially expressed genes between groups. The following parameters were used:

    resp.type="Two class unpaired"
    random.seed=100
    nperms=200
    testStatistic="wilcoxon"
    fdr.output=0.05

2.9 Comparison of significant genes

The groups within the two datasets were tested for overlap of genes with high expression and genes with low expression. A contingency table was built and a Fisher's exact test was used to compute a p-value. These p-values were adjusted using the Benjamini-Hochberg correction. The Fisher's exact test also computed an odds ratio that represents the strength of the association, where an odds ratio > 1 indicates a positive association between the two lists.

2.10 Group-discriminatory genes

A random forest classifier was generated for samples in each group versus all other samples to identify discriminatory genes using randomForest v6.6-6 in R v3.1.3. Each classification tree was built using a bootstrap sample of the data, and at each split a random subset of the variables was used. The number of trees built was 50000 and the number of variables sampled at each split (mtry) was 100. At each split, one variable was tried and a Gini index was calculated by measuring the total decrease in node impurity, averaged over all trees. To select the discriminatory genes, a new forest was built iteratively after discarding the genes with the smallest variable importance.
The set of genes selected is the smallest one that yields the lowest out-of-bag (OOB) error rate. For each classifier, the estimated OOB error was calculated as a function of the number of most important genes. The smallest set of genes that minimized the OOB error rate was reported.119 The following parameters were used:

    y=response
    data=expression.data
    ntree=50000
    mtry=100
    importance=TRUE
    proximity=TRUE
    keep.forest=TRUE
    na.action=na.omit

Chapter 3: Results

3.1 Overview

AML is a heterogeneous disease with many classes including core-binding factor rearrangements inv(16) and t(8;21), complex karyotypes, and CN-AML. The remarkably normal karyotype of CN-AML suggests that aberrant gene expression driven by molecular aberrations, rather than large chromosomal alterations, may play a dominant role in driving tumourigenesis. The overall objective of this work was two-fold: 1) to define potential CN-AML subclasses based on gene expression signatures, and 2) to identify whether additional fusions, undetectable via karyotype analysis, existed within these subclasses. Karyotype analysis provides a global evaluation of both numerical and structural abnormalities within chromosomes. However, current G-banding techniques are limited in resolution and are only able to detect rearrangements greater than 5 million base pairs (bp) apart. Structural variants smaller than 5 million bp, such as ITDs in FLT3 and PTDs in KMT2A, have been detected in AML using next generation sequencing techniques and are essential for predicting prognosis.67,68 This suggests that gene fusions resulting from chromosomal alterations smaller than 5 million bp exist and may provide additional insight into AML pathogenesis. I tested two hypotheses, the first addressing whether CN-AML could be separated from other AML subtypes using their gene expression profiles, and the second testing whether CN-AML could be split into multiple subgroups based on their gene expression and mutation signatures.
To test these hypotheses, I analyzed two AML cohorts from the Personalized Medicine Project (PMP) and The Cancer Genome Atlas (TCGA) project (Figure 3.1). For my purposes, the PMP cohort was considered the test cohort and consists of patients with de novo AML and sAML. Conversely, the TCGA cohort served as a validation cohort and consists of patients with de novo AML. The PMP cohort (n=124) was used to test the first hypothesis of whether CN-AML is separated from other AML subtypes by gene expression signatures. The PMP cohort of CN-AML cases (n=53) was used to test the second hypothesis of whether CN-AML could be separated into subclasses by differences in gene expression and mutation profiles. The TCGA cohort of CN-AML cases (n=80) was used to validate the findings from the PMP cohort.

Figure 3.1: Workflow of the analyses performed
This figure illustrates the two hypotheses of my thesis and the analyses that were used to test my hypotheses: 1) CN-AML subclasses can be identified and differentiated from known AML subtypes using gene expression profiling; 2) CN-AML subclasses have distinct gene expression and mutation profiles that may be useful in predicting prognosis.

3.2 Gene expression and mutation signatures define AML subtypes and CN-AML

The PMP cohort (n=124) includes patients with the most common cytogenetic AML subtypes and reflects the heterogeneity of cytogenetic and molecular abnormalities in AML (Table 3.1). To determine whether CN-AML cases can be differentiated from known AML subtypes, I performed an unsupervised hierarchical cluster analysis using a cross-sample subset of the 5000 most variable genes. Consensus clustering of the gene expression abundances suggested an optimal cluster solution with five groups (Figure 3.2). The majority of the sAML patients were classed in Group 5, which had a distinct gene expression signature compared to the other groups (Figure 3.2).
Cases with t(15;17)/PML-RARA, inv(16)/CBFB-MYH11, and t(8;21)/RUNX1-RUNX1T1 (favourable risk group) were separated into Group 1. Interestingly, CN-AML cases were split into two distinct groups, Groups 3 and 4, indicating two separate subgroups exist within CN-AML. I noted that Group 3 was significantly associated with NPM1 and DNMT3A mutations, and Group 4 was significantly associated with CEBPA mutations (Figure 3.2).

To determine whether the clustering correlated with differences in survival, I performed a Kaplan-Meier survival estimate and generated a survival curve for each group. There was a significant survival difference (Kaplan-Meier analysis, log-rank test, p=2.28×10⁻²) between Group 1, enriched for patients in the favourable risk group, and Group 5, enriched for sAML patients (Figure 3.3). There was also a significant survival difference (Kaplan-Meier analysis, log-rank test, p=2.61×10⁻²) between Group 1 and Group 4, enriched for CN-AML patients (Figure 3.3). There was no significant difference in survival (Kaplan-Meier analysis, log-rank test, p=0.597) between Groups 3 and 4, enriched for CN-AML cases (Figure 3.3), which may be due to the relatively small sample size.

Table 3.1: Patient characteristics

Variable        Subtype            n (%)        Median (range)
Age (years)                                     55.4 (24.8-76.2)
Males                              65 (52.4)
WHO Category    AML-t(15;17)       5 (4)
                AML-t(8;21)        4 (3.2)
                AML-inv(16)        11 (8.9)
                AML-inv(3)         1 (0.8)
                AML-11q23          4 (3.2)
                AML-t(7;11)        1 (0.8)
                AML-del5(q)        1 (0.8)
                AML-complex        4 (3.2)
                AML-NK (CN-AML)    53 (42.7)
                AML-NOS            3 (2.4)
                AML-ERYT (M6)      14 (11.3)
                AML-MAST           1 (0.8)
                AML-MEGA           1 (0.8)
                AML-MDS (sAML)     20 (16.1)
                AML-MPN            1 (0.8)

Figure 3.2: Unsupervised gene expression patterns
Unsupervised consensus clusters for de novo AML and sAML samples in the PMP cohort.
Shown from top to bottom are the silhouette-width profile (a metric that reflects how well a sample fits into a distinct cluster) that was calculated from the consensus membership matrix; a gene abundance heatmap; and covariates including subtypes and observed mutations, with P-values for associations, corrected for multiple hypothesis testing, at the far left. B-H denotes Benjamini-Hochberg multiple-testing correction. The numbers refer to the silhouette-width profiles for which P-values are provided. One asterisk denotes P<0.05, two asterisks denote P<0.01, and three asterisks denote P<0.001. The colour scale for the heatmap is row-scaled log10(RPKM+1) (reads per kilobase per million mapped reads). The scale-bar numbers (-4 for least abundant to 4 for most abundant) indicate the range of abundance values in the heatmap.

Figure 3.3: Kaplan-Meier survival analysis between groups
Kaplan-Meier survival analysis performed on groups from consensus clustering. The overall survival is measured in months and the number of patients surviving is displayed as a fraction. A legend displaying the colour of each group is shown in the top right corner of the plot. A Chi-square test was used to determine whether the grouping variable has a significant influence on survival time across all groups (p-value shown in bottom left corner). P-values were also calculated for each pair of groups and shown as a bar graph (right top corner). The red asterisk indicates a statistically significant P-value (P<0.05).

3.3 Gene expression profiles and novel fusions define distinct subclasses of CN-AML

CN-AML patients have variable responses to standard chemotherapy, suggesting that the CN-AML group can be further stratified into subclasses. The previous analysis split the CN-AML cases into two groups that were significantly associated with NPM1 and DNMT3A mutations (Group 3), or CEBPA mutations (Group 4).
These known mutations are established biomarkers that help predict response to treatment, relapse risk, and overall survival.30 This suggests that further stratification of CN-AML into more subclasses may identify additional biomarkers in patients with similar responses to treatment. To define CN-AML subclasses, I performed unsupervised hierarchical clustering of the 5000 most variable genes in CN-AML cases from the PMP cohort (n=53) and CN-AML cases from the TCGA cohort (n=80), separately. The PMP cohort was split into three groups and the TCGA cohort was split into four groups (Figure 3.4). For simplicity, from now on the groups in PMP will be referred to as P1, P2, and P3, and the groups in TCGA will be referred to as T1, T2, T3, and T4. Survival analysis suggested stratification of P2 from P1 and P3 (Figure 3.4A). By the log-rank test, there was a significant survival difference between P2 and P3 (Kaplan-Meier analysis, log-rank test, p=9.75×10⁻³), although the survival difference between P2 and P1 was not significant (Kaplan-Meier analysis, log-rank test, p=5.58×10⁻²). In TCGA, there was a significant difference in survival between T3 and T4 (Kaplan-Meier analysis, log-rank test, p=2.67×10⁻²). These p-values are not corrected for multiple hypothesis testing, although all pairwise group comparisons are shown (Figure 3.4).

To explore the gene expression signatures of each group, I performed a multi-class differential expression analysis across groups using the SAM method. P1 was significantly associated (Fisher’s exact test, P<0.05) with mutations in NPM1, DNMT3A, and PTPN11, and P2 was significantly associated (Fisher’s exact test, P<0.05) with FLT3-ITDs (Figure 3.5). Mutations in myeloid transcription factors, CEBPA and RUNX1, were only observed in P2 and P3. In TCGA, no significant associations with known mutations were observed.
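The group-mutation associations above rest on Fisher's exact test applied to a 2x2 table (cases inside or outside a group, with or without a given mutation). The sketch below is an illustration only, not the code used in this thesis: it implements a two-sided test in plain Python, the function name is an assumption, and the counts are hypothetical placeholders chosen to total 53 cases rather than the observed PMP values.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x mutated cases falling in the group.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts (NOT the observed data): rows are in-group / out-of-group,
# columns are mutated / wild-type.
p_value = fisher_exact_two_sided(12, 6, 5, 30)
print(f"two-sided P = {p_value:.2e}")
```

With several mutations tested per group, the resulting P-values would then be adjusted for multiple testing, as noted for the Benjamini-Hochberg correction in the figure captions.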
To identify fusion events specifically associated with the PMP groups, I performed de novo transcriptome assembly and structural variant detection on the RNA-Seq data from the CN-AML samples. Each potential fusion transcript was confirmed by aligning the contig and reads to the human genome (hg19) to determine if at least three reads spanned the breakpoint. In total, nine fusion transcripts were confirmed in the 53 CN-AML samples (Figure 3.6). Two recurrent fusions were observed: PIM3-SCO2 in four cases and OAZ1-DOT1L in three cases. These fusions were reported in samples from P1 and P3. Fusions such as KAT6A-CREBBP and TFG-GPR128 (reported in samples from P1) are known fusion transcripts120,121 that were possibly missed in the diagnostic karyotyping. Two other fusions, UBQLN1-HNRNPK and IQGAP1-ZNF774, were observed in cases from P3. A fusion involving KAT6A and SORBS3 was observed in one case from P2 with 173 supporting reads. The fusion transcripts RTF1-MAP1A and TTYH3-MAFK were also observed in two different samples from P2 with 41 and 10 supporting reads, respectively (Table 3.2). The majority of fusions involved genes located on the same chromosome and < 5 million base pairs apart, suggesting these fusion transcripts may have arisen through deletions or duplications. The genes composing the confirmed fusions were compared to the reported mutations from the TCGA online database (cBioPortal.org) and are reported in Table 3.3. A copy number alteration (homozygous deletion) in UBQLN1 and HNRNPK was identified in one case from T1. In T3, a missense mutation in GPR128 and a truncating mutation in HNRNPK were identified in two separate cases. A copy number alteration (homozygous deletion) in PIM3 and SCO2 was identified in the same case in T4. Lastly, a copy number alteration (amplification) in CREBBP and a TFG-GPR128 fusion were identified in two different cases in T4.
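The recurrence counts quoted above can be re-derived from the rows of Table 3.2. The following sketch is illustrative and not part of the original pipeline: the rows are transcribed from Table 3.2 (with the 5’ gene written as OAZ1, as in the text), while the function name and the two-case recurrence threshold are assumptions.

```python
# (sample, cluster group, 5' gene, 3' gene, supporting reads), from Table 3.2
FUSIONS = [
    ("A08897", 1, "PIM3", "SCO2", 22),
    ("A08870", 1, "PIM3", "SCO2", 8),
    ("A08870", 1, "OAZ1", "DOT1L", 13),
    ("A08894", 1, "OAZ1", "DOT1L", 2),
    ("A08838", 1, "KAT6A", "CREBBP", 34),
    ("A08885", 1, "TFG", "GPR128", 165),
    ("A08855", 3, "PIM3", "SCO2", 5),
    ("A08879", 3, "PIM3", "SCO2", 5),
    ("A08868", 3, "OAZ1", "DOT1L", 2),
    ("A08855", 3, "UBQLN1", "HNRNPK", 12),
    ("A15353", 3, "IQGAP1", "ZNF774", 48),
    ("A08862", 2, "KAT6A", "SORBS3", 173),
    ("A08864", 2, "RTF1", "MAP1A", 41),
    ("A08865", 2, "TTYH3", "MAFK", 10),
]
N_CASES = 53  # CN-AML cases in the PMP cohort


def recurrent_fusions(rows, min_cases=2):
    """Map each recurrent fusion to (number of cases, sorted cluster groups)."""
    cases, groups = {}, {}
    for sample, group, five_prime, three_prime, _reads in rows:
        name = f"{five_prime}-{three_prime}"
        cases.setdefault(name, set()).add(sample)
        groups.setdefault(name, set()).add(group)
    return {
        name: (len(samples), sorted(groups[name]))
        for name, samples in cases.items()
        if len(samples) >= min_cases
    }


n_distinct = len({(row[2], row[3]) for row in FUSIONS})
recurrent = recurrent_fusions(FUSIONS)

print(f"{n_distinct} distinct fusions")
for name, (n, grp) in sorted(recurrent.items()):
    print(f"{name}: {n} of {N_CASES} cases ({100 * n / N_CASES:.1f}%), groups {grp}")
```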
To determine if the expression of the genes composing the identified fusions had any impact on survival, I performed a Kaplan-Meier analysis on each gene to define a cutoff gene expression level that stratifies patients into two separate groups (recurrent fusions shown in Figure 3.7). For PIM3 and SCO2, the survival trends between the two cohorts were inconsistent. For example, high expression of SCO2 was significantly correlated (P=9.50×10⁻³) with poor survival in the PMP cohort. In contrast, high expression of SCO2 was significantly correlated (P=5.00×10⁻²) with better survival in the TCGA cohort. For OAZ1 and DOT1L, the survival trends in the two cohorts were alike: OAZ1 expression did not have a significant impact on survival, and high expression of DOT1L was significantly correlated with poor survival for the PMP cohort (P=0.013) and TCGA cohort (P=0.011). I concluded that assigning a cutoff expression value to a single gene is not a robust method of predicting survival; therefore I performed a differential expression analysis to identify gene sets that define each subgroup.

Figure 3.4: Consensus clustering of CN-AML samples
Consensus matrices, silhouette-width profiles and Kaplan-Meier survival estimates for CN-AML samples in the a) PMP and b) TCGA cohorts. The silhouette-width profiles were calculated from their respective consensus membership matrices. The Kaplan-Meier survival plots compare the survival curves of each group. The overall survival is measured in months and the number of patients surviving is displayed as a fraction of the number of patients in each group. A legend displaying the colour of each group is shown in the top right corner of the plot. A Chi-square test was used to determine whether the grouping variable has a significant influence on survival time across all groups (p-value shown in bottom left corner of plot).
P-values were calculated for each pair of groups and shown as a bar graph in the top right corner, with asterisks indicating statistically significant differences before multiple hypothesis correction.

Figure 3.5: Unsupervised gene expression patterns for CN-AML cases
Unsupervised consensus clusters for the CN-AML cases in the A) PMP and B) TCGA cohorts. Shown from top to bottom are the silhouette-width profile; a gene abundance heatmap; and covariates including observed mutations, with P-values for associations, corrected for multiple hypothesis testing, at the far left and right. B-H denotes Benjamini-Hochberg multiple-testing correction. The numbers refer to the silhouette-width profiles for which P-values are provided. One asterisk denotes P<0.05, two asterisks denote P<0.01 and three asterisks denote P<0.001. The colour scale for the heatmap is row-scaled log10(RPKM+1) (reads per kilobase per million mapped reads). The scale-bar numbers (-4 for least abundant to 4 for most abundant) indicate the range of abundance values in the heatmap.

Figure 3.6: OncoPrint of fusions identified in CN-AML cases
The identified fusions in CN-AML cases (n=53) from the PMP cohort. The first gene represents the 5’ gene and the second gene represents the 3’ gene in fusion partners. Each box in the OncoPrint represents a case and those boxes containing a triangle indicate that the fusion in the corresponding line is present in that sample. The percentage of cases containing the observed fusions was calculated and is reported beside the respective fusion gene. The recurrent fusions are enclosed in a red rectangle.

Table 3.2: Fusions identified in CN-AML cases in the PMP cohort
The 5’ gene and 3’ gene are listed for each fusion and the number of supporting reads is reported. Each fusion was manually confirmed using the UCSC Genome Browser BLAT alignment tool against the human genome reference 19 (hg19).
The group assignment from the cluster analysis is shown, and genes in red were observed in more than one case.

Sample    Group    5’ gene    3’ gene    Support reads
A08897    1        PIM3       SCO2       22
A08870    1        PIM3       SCO2       8
A08870    1        OAZ1       DOT1L      13
A08894    1        OAZ1       DOT1L      2
A08838    1        KAT6A      CREBBP     34
A08885    1        TFG        GPR128     165
A08855    3        PIM3       SCO2       5
A08879    3        PIM3       SCO2       5
A08868    3        OAZ1       DOT1L      2
A08855    3        UBQLN1     HNRNPK     12
A15353    3        IQGAP1     ZNF774     48
A08862    2        KAT6A      SORBS3     173
A08864    2        RTF1       MAP1A      41
A08865    2        TTYH3      MAFK       10

Table 3.3: Genetic alterations in CN-AML cases from the TCGA cohort
Mutations in genes involved in the confirmed fusions from the subsequent analysis of the PMP cohort are reported for the CN-AML cases from the TCGA cohort. These mutations include copy number alterations (homozygous deletions and amplifications) as well as fusions, missense, and truncating mutations.

Sample          Group    Gene          Mutation type
TCGA-AB-2971    1        UBQLN1        Homozygous deletion
TCGA-AB-2971    1        HNRNPK        Homozygous deletion
TCGA-AB-2921    3        GPR128        Missense (A548T)
TCGA-AB-2859    3        HNRNPK        Truncating (L68Rfs*25)
TCGA-AB-2955    4        PIM3          Homozygous deletion
TCGA-AB-2955    4        SCO2          Homozygous deletion
TCGA-AB-3009    4        CREBBP        Amplification
TCGA-AB-2877    4        TFG-GPR128    Fusion

Figure 3.7: Kaplan-Meier survival analysis of gene expression
Kaplan-Meier survival plots for the A) PMP and B) TCGA cohorts for recurrent fusions identified in the previous analysis. A Cox proportional hazard model was fitted to each gene and survival variable, and the optimal cutoff was selected by maximizing the significance assessed by the log-rank test that stratified patients into two groups. The overall survival is measured in months and the number of patients surviving is displayed as a fraction. P-values were calculated and shown in the top right corner of each plot. Red indicates high expression and black indicates low expression.
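The cutoff selection described in the Figure 3.7 caption can be sketched as a scan over candidate expression thresholds that keeps the threshold with the smallest log-rank P-value. This is a minimal illustration in plain Python, not the code used for the figure: the Cox model fit is omitted, the expression and survival values are invented toy data, and the function names and the minimum group size of three are assumptions.

```python
from math import erfc, sqrt


def logrank_p(times, events, high):
    """P-value of a two-group log-rank test (chi-square, 1 degree of freedom)."""
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        n = sum(1 for ti in times if ti >= t)  # at risk, both groups
        n1 = sum(1 for ti, g in zip(times, high) if ti >= t and g)  # at risk, high group
        d = sum(1 for ti, e in zip(times, events) if e and ti == t)  # deaths at t
        d1 = sum(1 for ti, e, g in zip(times, events, high) if e and ti == t and g)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / variance
    return erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df


def best_cutoff(expression, times, events, min_group=3):
    """Return (cutoff, P) minimizing the log-rank P-value, or None."""
    best = None
    for cut in sorted(set(expression)):
        high = [x > cut for x in expression]
        n_high = sum(high)
        if min(n_high, len(high) - n_high) < min_group:
            continue  # skip splits that leave a group too small
        p = logrank_p(times, events, high)
        if best is None or p < best[1]:
            best = (cut, p)
    return best


# Toy data (invented): higher expression goes with shorter survival.
expression = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
months = [30, 28, 26, 24, 22, 8, 6, 5, 4, 3]
died = [1] * 10  # 1 = death observed, 0 = censored

best = best_cutoff(expression, months, died)
print(f"cutoff = {best[0]}, log-rank P = {best[1]:.4f}")
```

Because the cutoff is chosen to minimize the P-value, the reported P-value is optimistic, which is in keeping with the conclusion that a single-gene expression cutoff is not a robust predictor of survival.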
3.4 Gene expression of discriminatory gene sets classify CN-AML

To identify differentially expressed genes between CN-AML subgroups, two-class unpaired differential expression (DE) analyses of target groups compared to all other groups were performed for both cohorts (Figure 3.8). For example, P1 was compared to P2 and P3; P2 was compared to P1 and P3, and so forth. The DE analysis identified genes with significant fold-changes (P<0.05) between groups including genes with both high expression (red) and low expression (green) (Figure 3.8). In total, 3419, 2872, and 105 significant genes were identified in P1, P2, and P3, respectively (Figure 3.8A). For TCGA, 4094, 1563, 900, and 3148 significant genes were identified for T1, T2, T3, and T4, respectively (Figure 3.8B).

To establish whether these differentially expressed genes were shared among groups between the two cohorts, multiple pairwise Fisher’s exact tests were performed to compute the probability of having a significant number of genes shared between group pairs (Figure 3.9). The genes with either high or low expression were compared independently. I identified a significant number of shared differentially expressed genes (Fisher’s exact test, P<0.001) between P1 and T1 and between P2 and T2 as well as P2 and T4 (Figure 3.10).

To identify the most biologically significant genes shared among groups between the two cohorts, I analyzed the top fifteen genes with the greatest positive and negative fold-changes (FCs) for each group in the PMP and TCGA cohorts (Figure 3.11). MEST (Mesoderm specific transcript), an imprinted gene encoding a hydrolase, was identified as highly expressed in both P1 and T1. GATA1 is a transcription factor that is involved in erythroid development and was up-regulated in P1 and down-regulated in P3 and T2. The basic leucine zipper transcription factor, MAFB, was down-regulated in P1 and T1 and up-regulated in P2 and T2.
MAFB has an important role in the regulation of lineage-specific hematopoiesis. VCAN, a versican proteoglycan involved in cell adhesion, proliferation and migration, was down-regulated in P1 and T1, and up-regulated in P2, T2, and T4.

To determine gene sets that define each group, I performed random forest classification and identified the smallest number of genes that could accurately classify a group. The classifier for P1 contained eleven genes, and was characterized by high expression of DMXL2, FN1, and RAB1A (Figure 3.12, P1). DMXL2 is a regulator of Notch signaling, and studies have implicated the Notch pathway as a tumour suppressor in myeloid malignancies.122 FN1 encodes a fibronectin involved in cell adhesion and migration, and RAB1A is a member of the Ras superfamily GTPases. P2 contained six genes in its classifier, two of which were zinc-finger proteins involved in transcription regulation. P2 was characterized by high expression of ARHGAP15 (Rho GTPase activating protein) and UBE2G2 (ubiquitin-conjugating enzyme), and low expression of MAST3 (microtubule associated serine/threonine kinase) (Figure 3.12, P2). P3’s classifier contained three genes including HBA1 and HBA2, which are part of the human alpha hemoglobin gene cluster (Figure 3.12, P3).

For TCGA, T1 was classified with seven genes including C17orf87, NLRP12, and CD4, which are involved in various immune responses, and SIRPB1, which is a signal regulatory protein involved in tyrosine kinase-coupled signaling processes (Figure 3.12, T1). The classifier for T2 contained three genes: EPB41, HBB, and ALAS2. EPB41 functions to maintain erythrocyte shape; HBB is the hemoglobin beta subunit; and ALAS2 is an enzyme that catalyzes the first step in the heme biosynthetic pathway (Figure 3.12, T2). T3 contained ten genes in its classifier, six of which are ribosomal proteins that catalyze protein synthesis, and two long non-coding RNAs, GAS5 and SNHG8 (Figure 3.12, T3).
The classifier for T4 required seventeen genes, including the tumour protein P53 binding protein 1 (TP53BP1) and an apoptotic inducing protein (ACIN1) (Figure 3.12, T4).

Figure 3.8: Differentially expressed genes in the PMP and TCGA cohorts
Quantile plots of the observed relative difference versus the expected relative difference of gene expression in each group from the A) PMP and B) TCGA cohorts. The solid line denotes where the observed relative difference is identical to the expected relative difference. The dotted lines represent the threshold, Δ, and were calculated for an FDR = 0.05. Each dot represents a gene; the red and green dots represent significant differentially expressed genes that are outside the Δ threshold.

Figure 3.9: Significant associations of differentially expressed genes between groups
Heatmaps illustrating the number of A) genes with high expression and B) genes with low expression shared between groups across the PMP and TCGA cohorts. The rows represent the gene lists for the groups in the PMP cohort and the columns represent the gene lists for the groups in the TCGA cohort. A Fisher’s exact test, corrected for multiple hypothesis testing using Benjamini-Hochberg, was used to test for a significant number of shared differentially expressed genes between each pair of groups’ gene lists; three asterisks denote P<0.001. The colour scale for the heatmap is log2(Odds Ratio), where 0 (white) represents no association and 6 (dark blue) represents the strongest association.

Figure 3.10: Differentially expressed genes of significant group associations
Venn diagrams of the number of genes with A) high expression and B) low expression shared by each of the significant associations evaluated from the Fisher’s exact test. The cohort and group labels are beside each oval and the number of shared genes is overlaid on the diagrams.
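The comparison summarized in Figure 3.9 combines three steps: a Fisher's exact test on the overlap between two gene lists, a log2 odds ratio for the heatmap colour, and Benjamini-Hochberg correction across group pairs. The sketch below illustrates these steps in plain Python under stated assumptions: a one-sided (enrichment) test, a 0.5 continuity correction to keep the odds ratio finite, a hypothetical universe of 1000 genes, and invented gene identifiers. It is not the code used to produce the figure.

```python
from math import comb, log2


def overlap_enrichment(list_a, list_b, universe_size):
    """One-sided Fisher P-value and log2 odds ratio for two gene-list overlap."""
    a, b = set(list_a), set(list_b)
    k = len(a & b)
    a_only, b_only = len(a) - k, len(b) - k
    neither = universe_size - k - a_only - b_only
    # Hypergeometric upper tail: probability of an overlap of at least k genes.
    tail = sum(
        comb(len(a), x) * comb(universe_size - len(a), len(b) - x)
        for x in range(k, min(len(a), len(b)) + 1)
    )
    p = tail / comb(universe_size, len(b))
    # Assumed 0.5 continuity correction so empty cells do not give infinities.
    odds = ((k + 0.5) * (neither + 0.5)) / ((a_only + 0.5) * (b_only + 0.5))
    return p, log2(odds)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented gene lists: one PMP-like list compared with two TCGA-like lists.
UNIVERSE = 1000
pmp_up = [f"g{i}" for i in range(0, 100)]
tcga_lists = {
    "overlapping": [f"g{i}" for i in range(50, 150)],  # shares 50 genes
    "disjoint": [f"g{i}" for i in range(500, 600)],    # shares none
}

results = {name: overlap_enrichment(pmp_up, genes, UNIVERSE)
           for name, genes in tcga_lists.items()}
adjusted = benjamini_hochberg([results[name][0] for name in tcga_lists])
for name, adj in zip(tcga_lists, adjusted):
    print(f"{name}: log2(OR) = {results[name][1]:.2f}, B-H adjusted P = {adj:.3g}")
```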
Figure 3.11: Greatest fold-changes in groups from the PMP and TCGA cohorts
Top 30 genes with the greatest positive and negative fold-changes between groups from each DE analysis performed for the A) PMP and B) TCGA cohorts. Each colour corresponds to a gene that was observed in both datasets at least once (legend in top right corner). The (a, b, c, d) notation shows the number of groups in which the gene had a positive FC or negative FC for each cohort, where a and b represent the number of PMP groups with positive and negative FCs, and c and d represent the number of TCGA groups with positive and negative FCs.

Figure 3.12: Discriminatory genes for groups in the PMP and TCGA cohorts
Using random forest classifiers, discriminatory genes were identified using the RPKM data. Genes that help a classifier separate or discriminate a group of samples from all other samples tend to have higher or lower expression in samples in that group. For example, refer to the beeswarm/box-whisker plot for RPL7A (low) and RAB1A (high) for P1, in which each gene’s RPKM distribution for P1 is highlighted by a red rectangle. For each group, panels show (left to right, top to bottom):
A) The importance of genes in a classifier is correlated to Kruskal-Wallis P-values for genes being differentially expressed
B) Profile of the estimated classifier error rate as a function of the number of the 20 most important genes
C) Table of discriminatory genes, ranked by importance (Gini), with genes that were discriminatory in bold
D) Beeswarm/box-whisker plots of expression across the groups for the top-ranked genes (up to 6)
E) Estimated overall error rate

Chapter 4: Discussion

The overall objective of this thesis was to define subclasses of CN-AML.
Given the heterogeneous nature of AML and variable treatment responses of CN-AML patients, I hypothesized that CN-AML would be separated into two or more subgroups and that these subgroups would have distinct gene expression and mutation profiles. Further stratification of CN-AML patients by their underlying genetic differences would be clinically significant for developing risk-adapted treatment approaches.

A cohort-level clustering analysis of 124 AML cases (including CN-AML) in the PMP cohort revealed that CN-AML samples separated into two subgroups based on distinct gene expression signatures. Of the two subclasses of CN-AML, the first (Group 3) was significantly associated with mutations in NPM1 and DNMT3A. This was notable, since NPM1-mutated AML has previously been reported to have a distinct molecular signature that is characterized by the activation of specific HOX cluster genes that are involved in hematopoietic development.123 Conversely, the second subgroup (Group 4) was significantly associated with biallelic mutations in CEBPA. CEBPA double mutations have also been linked to distinct gene expression profiles including down-regulation of HOXA and HOXB genes and up-regulation of GATA1 and EPOR, which are involved in erythroid differentiation.65,107 A number of studies have shown that biallelic CEBPA mutations and NPM1 mutations are mutually exclusive and therefore these mutated genes offer a possible explanation for why these two subgroups were separated into two distinct clusters in my analysis.124 This data shows that CN-AML is separated into at least two subgroups that are characterized by unique gene expression signatures.

The known cytogenetic subtypes including t(15;17)/PML-RARA for acute promyelocytic leukemia (APL) as well as t(8;21)/RUNX1-RUNX1T1 and inv(16)/CBFB-MYH11 for core-binding factor AML were also separated into a distinct cluster.
This is consistent with previous studies, where other independent research groups have shown that these particular cytogenetic rearrangements are discriminated from all other cytogenetic subtypes.105,106 Valk et al. demonstrated that the growth factors, HGF, MST1, and FGF13 were specific to cases with t(15;17). They also reported that RUNX1T1 was discriminating for t(8;21) and MYH11 was discriminating for inv(16).22 These transcription factor fusions are prognostic predictors of a favourable outcome and my analysis corroborates this prediction, where Group 1 is enriched for these cytogenetic rearrangements, and has a better overall survival compared to the groups enriched with CN-AML and sAML cases (Figures 3.2 and 3.3). Although there was no significant survival difference between the two groups enriched with CN-AML samples, this may have been affected by the other samples that are present in these clusters. This data suggests that the two CN-AML groups, while genetically distinct, have a similar prognosis.  A clustering analysis of the CN-AML cases from the PMP and TCGA cohorts identified three subgroups and four subgroups, respectively. P2 was observed to have lower overall survival compared to P1 and P3, which is most likely explained by the fact that P2 was significantly associated with FLT3-ITDs. ITDs in FLT3 are known to adversely affect a patient’s clinical outcome and are considered when determining risk of CN-AML patients.125 In TCGA, there was a significant survival difference between T3 and T4, although no significant associations with mutations were identified. This suggests that several mutations are contributing to the overall survival of the patients in each group.   
Fusions such as PML-RARA, RUNX1-RUNX1T1, and CBFB-MYH11 encode oncogenic chimeric proteins that have critical roles in driving AML tumourigenesis and are invaluable in predicting treatment response and overall survival.26 Some fusions are cytogenetically cryptic and thus are not detected using standard karyotype analysis; therefore I focused on identifying novel fusions between genes that are < 5 million base pairs apart. Transcriptome assembly and variant analysis identified novel fusions. Two recurrent fusions, PIM3-SCO2 and OAZ1-DOT1L, were present in P1 and P3 and absent in P2. The PIM3-SCO2 fusion has previously been reported as highly prevalent in childhood AML, occurring in approximately 80% of cases.126 PIM3 is a serine/threonine kinase and is an important regulator of signal transduction. Other PIM family genes have been found to be up-regulated in a variety of malignancies. In childhood AML, PIM3 was found to have variable expression across all subtypes, including CN-AML cases. Furthermore, the fusion transcript was detected at low levels in normal bone marrow.126 This is interesting because it has been reported that known oncogenic fusions, BCR-ABL from t(9;22) in CML and BCL2-IGH from t(14;18) in follicular lymphoma, have also been detected at a low frequency in healthy individuals.127,128 This suggests that chromosomal aberrations can occur in normal hematopoietic cells and that the PIM3-SCO2 fusion may be involved in normal hematopoietic development and is highly deregulated in AML. Further work would be required to elucidate its potential role in myeloid malignancies.

The other recurrent fusion, OAZ1-DOT1L, is a novel observation in AML.
However, DOT1L has been implicated in the development of KMT2A-rearranged leukemia.129 DOT1L is the only histone H3 lysine 79 (H3K79) methyltransferase in mammals130 and has been shown to cause aberrant H3K79 methylation of the HOX cluster genes that results in leukemic transformation.51 It is possible that the OAZ1-DOT1L fusion transcript could cause misdirection of the DOT1L protein to methylate other genes, perhaps even the genes that were identified as differentially expressed. The downstream effects of the fusion transcript should be explored in more detail to determine the genes that are methylated at H3K79 and any consequences that they may have on AML pathogenesis. I observed a TFG-GPR128 fusion in one sample in P1 that was caused by a 111 kilobase tandem duplication.120 TFG-GPR128 has been reported in AML, although it has not been associated with any obvious clinical phenotype. TFG (TRK-fused gene) is a known target of acquired chromosomal translocations including a TFG-ALK fusion in anaplastic large cell lymphoma.131 TFG is thought to have a role in signaling because it interacts with PTEN and the proteins in the NF-κB pathway. GPR128 is a member of the adhesion subfamily of G-protein coupled receptors.131 Additional analyses into the normal functions of TFG and GPR128 are required to understand the functional consequences of the TFG-GPR128 fusion gene product.
In one sample in P1, I identified an established fusion, KAT6A-CREBBP, from t(8;16), which is a rarely occurring fusion in therapy-related AML.132 KAT6A is a histone acetyltransferase that acts as a co-activator for many transcription factors.133 Previous studies have shown that KAT6A is essential for self-renewal of hematopoietic stem cells.134 To date, six KAT6A fusions have been reported in leukemia with the following partners: CREBBP, EP300, NCOA2, NCOA3, ASXL2, and LEUX.121 These fusion genes have been shown to produce the same monocytic leukemia phenotype, and the fusion proteins enable the transformation of myeloid progenitors into leukemic stem cells.134 Other studies have shown that KAT6A cooperates with KMT2A to overexpress HOXA9, HOXA10, and MEIS1, which drive leukemogenesis.135 In addition, I identified a novel fusion, KAT6A-SORBS3, in one sample in P2 with 173 supporting reads spanning the breakpoint. I expect that the KAT6A-SORBS3 fusion would have similar effects to the previously described KAT6A fusions, although mechanisms of leukemogenesis for KAT6A-related AML are still not fully understood. Furthermore, KAT6A fusions should be explored for their prognostic significance and whether it is beneficial to incorporate these fusions into current AML risk stratification.

Two other fusions, UBQLN1-HNRNPK and IQGAP1-ZNF774, were identified in samples from P3. The genes UBQLN1 and HNRNPK are part of the commonly deleted region in del(9q) AML.136 UBQLN1 is a ubiquitin-like protein that regulates protein degradation via the ubiquitin-proteasome system and HNRNPK is a nuclear ribonucleoprotein that specifically binds to poly(C) tracts and is thought to play a role in cell cycle progression.
Studies have shown that UBQLN1 is down-regulated in both del(9q) and CN-AML compared to CD34-positive controls.136 Decreased activity or haploinsufficiency of critical genes has also been shown to contribute to leukemogenesis, which may be the case for del(9q) and CN-AML cases with decreased expression of UBQLN1. There have been no reports regarding the genes involved in the IQGAP1-ZNF774 fusion in AML, although IQGAP1 has various roles in cellular functions such as cell-cell adhesion and transcription.137 This IQGAP1-ZNF774 fusion could disrupt the normal binding of IQGAP1 to target proteins resulting in deregulated transcription and may possibly increase the likelihood of metastasis.

In P2, I identified two fusions, RTF1-MAP1A and TTYH3-MAFK, in two separate samples. The chromatin modifier, RTF1, is a component of the PAF1 complex (PAF1C), which is involved in histone modifications including methylation of genes at H3K4me3 to maintain stem cell pluripotency. PAF1C is also involved in transcriptional elongation and is required for the transcription of Hox and Wnt target genes. Furthermore, PAF1C has been shown to promote AML tumourigenesis by regulating the transcription of KMT2A.138 MAP1A encodes a microtubule-associated protein. It is possible that the RTF1-MAP1A fusion may disrupt the PAF1 complex and cause aberrant expression of KMT2A or Hox and Wnt target genes, thus contributing to AML pathogenesis. A more detailed analysis of the specific function of RTF1 in the PAF1 complex is required to elucidate its direct role in leukemogenesis. The genes involved in the TTYH3-MAFK fusion include TTYH3, which functions as a chloride anion channel, and MAFK, a basic leucine zipper transcription factor.
MAFK has been shown to impair erythroid maturation, and altered expression of MAFK has been reported in the inv(16) AML subtype.139 If the TTYH3-MAFK fusion transcript is translated, then the MAFK gene would be under the control of the TTYH3 promoter, which may result in aberrant expression of MAFK, thus impairing erythroid differentiation and contributing to AML tumourigenesis. Overall, the PMP subgroups did not have any significant defining mutations, although the PIM3-SCO2 and OAZ1-DOT1L fusions were recurrent in P1 and P3. In addition to structural genetic alterations, shifts in expression of specific genes can impact the prognosis of AML patients. For example, overexpression of EVI1 has been observed in all AML cases with inv(3)/t(3;3) as well as in a subset of other AML subtypes, including CN-AML cases. High expression of EVI1 has been correlated with a poor outcome, specifically in the normal cytogenetics class.140 Other single-gene expression predictors include high expression of the BAALC and ERG genes, both of which predict poor overall survival.140 Moreover, expression of specific genes is linked to treatment response. For example, low expression of MN1 has been connected to a better response to ATRA treatment in elderly patients with APL.110,141 I observed that high expression of DOT1L correlated with poor survival in both the PMP and TCGA cohorts. However, after testing all the genes observed in the identified fusions, I concluded that Kaplan-Meier analysis of the expression of single genes is not a robust measure of survival. AML is a heterogeneous and complex disease that cannot be defined by the expression of one gene; thus, I performed differential expression analysis to determine significant genes in each subgroup.

A significant number of differentially expressed genes were shared between P1 and T1, between P2 and T2, and between P2 and T4. The gene MEST was up-regulated in P1 and T1.
MEST encodes a hydrolase and is a paternally expressed gene that is regulated by imprinting. Loss of imprinting of this gene has been linked to different cancers. For example, decreased MEST expression has been associated with glioblastoma,142 whereas increased MEST expression is associated with breast and lung cancer.143

I observed high expression of the zinc-finger transcription factor GATA1 in P1 and low expression in P3 and T2. GATA1 has a role in the differentiation of hematopoietic cells and is usually expressed only in erythroid cells, megakaryocytes, eosinophils, and mast cells. However, recent studies have demonstrated that AML patients express GATA1 in myeloid progenitor cells and that high expression of GATA1 has a negative impact on survival.144 Other studies have shown that CEBPA double mutations are linked to the up-regulation of GATA1. I did not detect any CEBPA mutations in P1; thus, GATA1 up-regulation in this subgroup is presumably caused by a different mechanism.

I detected high expression of MAFB in P2 and T2 and low expression in P1 and T1. MAFB overexpression is a well-established marker for an unfavourable prognosis of t(14;20) in multiple myeloma patients.145 The t(14;20) translocation gives rise to an IGH-MAFB fusion, although only aberrant MAFB overexpression is observed. In multiple myeloma, MAFB acts as an oncogene. In AML, however, a deletion of the q arm of chromosome 20 has been reported, and loss of MAFB confers a proliferative advantage to myeloid cells.146 This implies that MAFB has tumour-suppressive roles in AML.

I detected high expression of VCAN in P2, T2, and T4 and low expression in P1 and T1. VCAN is a component of the extracellular matrix and is involved in the modulation of cell adhesion, proliferation, and migration.
High expression of VCAN has been implicated in breast, ovarian, and lung cancer, but in AML it has only been described in a cell line.147

4.1 Conclusions and future directions

In summary, the work presented here reveals the genetic complexity of CN-AML. My work emphasizes the heterogeneity of CN-AML and demonstrates that gene expression signatures are correlated with specific genetic abnormalities. I conclude that CN-AML can be further stratified into multiple subgroups and that these subgroups differ in treatment response and outcome. I found two recurrent fusions, PIM3-SCO2 and OAZ1-DOT1L, which were present in two of the subgroups and absent in the third. Further mutation analysis of a larger cohort of CN-AML patients with normal control samples would be required to determine whether these fusions are significant mutations that define CN-AML subgroups. I expect that the identification of novel fusions will help delineate subgroups of AML with different prognoses and yield new potential targets for AML therapy.

Bibliography

1 Redaelli, A., Stephens, J. M., Laskin, B. L., Pashos, C. L. & Botteman, M. F. The burden and outcomes associated with four leukemias: AML, ALL, CLL and CML. Expert review of anticancer therapy 3, 311-329, doi:10.1586/14737140.3.3.311 (2003). 2 Deschler, B. & Lubbert, M. Acute myeloid leukemia: epidemiology and etiology. Cancer 107, 2099-2107, doi:10.1002/cncr.22233 (2006). 3 Schoch, C. et al. The influence of age on prognosis of de novo acute myeloid leukemia differs according to cytogenetic subgroups. Haematologica 89, 1082-1090 (2004). 4 Dores, G. M., Devesa, S. S., Curtis, R. E., Linet, M. S. & Morton, L. M. Acute leukemia incidence and patient survival among children and adults in the United States, 2001-2007. Blood 119, 34-43, doi:10.1182/blood-2011-04-347872 (2012). 5 Patel, M. I., Ma, Y., Mitchell, B. S. & Rhoads, K. F. Age and genetics: how do prognostic factors at diagnosis explain disparities in acute myeloid leukemia?
Am J Clin Oncol 38, 159-164, doi:10.1097/COC.0b013e31828d7536 (2015). 6 Pulte, D., Redaniel, M. T., Brenner, H. & Jeffreys, M. Changes in survival by ethnicity of patients with cancer between 1992-1996 and 2002-2006: is the discrepancy decreasing? Annals of oncology : official journal of the European Society for Medical Oncology / ESMO 23, 2428-2434, doi:10.1093/annonc/mds023 (2012). 7 Siegel, R., Ward, E., Brawley, O. & Jemal, A. Cancer statistics, 2011: the impact of eliminating socioeconomic and racial disparities on premature cancer deaths. CA: a cancer journal for clinicians 61, 212-236, doi:10.3322/caac.20121 (2011). 8 Alter, B. P. Fanconi anemia and the development of leukemia. Best practice & research. Clinical haematology 27, 214-221, doi:10.1016/j.beha.2014.10.002 (2014). 9 Khan, I., Malinge, S. & Crispino, J. Myeloid Leukemia in Down Syndrome. Critical reviews in oncogenesis 16, 25-36, doi:10.1615/CritRevOncog.v16.i1-2.40 (2011). 10 Hahm, C. et al. Genomic aberrations of myeloproliferative and myelodysplastic/myeloproliferative neoplasms in chronic phase and during disease progression. International journal of laboratory hematology 37, 181-189, doi:10.1111/ijlh.12257 (2015). 11 Klimek, V. M. Recent advances in the management of therapy-related myelodysplastic syndromes and acute myeloid leukemia. Current opinion in hematology 20, 137-143, doi:10.1097/MOH.0b013e32835d82e6 (2013). 12 Kelly, L. M. & Gilliland, D. G. Genetics of myeloid leukemias. Annual review of genomics and human genetics 3, 179-198, doi:10.1146/annurev.genom.3.032802.115046 (2002). 13 Grimwade, D. The clinical significance of cytogenetic abnormalities in acute myeloid leukaemia. Best practice & research. Clinical haematology 14, 497-529, doi:10.1053/beha.2001.0152 (2001). 14 Gregory, T. K. et al. Molecular prognostic markers for adult acute myeloid leukemia with normal cytogenetics. Journal of hematology & oncology 2, 23, doi:10.1186/1756-8722-2-23 (2009). 15 Estey, E. & Dohner, H. Acute myeloid leukaemia.
Lancet (London, England) 368, 1894-1907, doi:10.1016/s0140-6736(06)69780-8 (2006). 16 Mardis, E. R. et al. Recurring mutations found by sequencing an acute myeloid leukemia genome. N Engl J Med 361, 1058-1066, doi:10.1056/NEJMoa0903840 (2009). 17 Marcucci, G., Haferlach, T. & Dohner, H. Molecular genetics of adult acute myeloid leukemia: prognostic and therapeutic implications. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 29, 475-486, doi:10.1200/jco.2010.30.2554 (2011). 18 Knudson, A. G., Jr., Hethcote, H. W. & Brown, B. W. Mutation and childhood cancer: a probabilistic model for the incidence of retinoblastoma. Proceedings of the National Academy of Sciences of the United States of America 72, 5116-5120 (1975). 19 Reilly, J. T. Pathogenesis of acute myeloid leukaemia and inv(16)(p13;q22): a paradigm for understanding leukaemogenesis? British journal of haematology 128, 18-34, doi:10.1111/j.1365-2141.2004.05236.x (2005). 20 Welch, J. S. et al. The origin and evolution of mutations in acute myeloid leukemia. Cell 150, 264-278, doi:10.1016/j.cell.2012.06.023 (2012). 21 Corces-Zimmerman, M. R., Hong, W. J., Weissman, I. L., Medeiros, B. C. & Majeti, R. Preleukemic mutations in human acute myeloid leukemia affect epigenetic regulators and persist in remission. Proceedings of the National Academy of Sciences of the United States of America 111, 2548-2553, doi:10.1073/pnas.1324297111 (2014). 22 Verhaak, R. G. & Valk, P. J. Genes predictive of outcome and novel molecular classification schemes in adult acute myeloid leukemia. Cancer treatment and research 145, 67-83, doi:10.1007/978-0-387-69259-3_5 (2010). 23 Dohner, H., Weisdorf, D. J. & Bloomfield, C. D. Acute Myeloid Leukemia. N Engl J Med 373, 1136-1152, doi:10.1056/NEJMra1406184 (2015). 24 Harris, N. L. et al.
World Health Organization classification of neoplastic diseases of the hematopoietic and lymphoid tissues: report of the Clinical Advisory Committee meeting-Airlie House, Virginia, November 1997. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 17, 3835-3849 (1999). 25 Campo, E. et al. The 2008 WHO classification of lymphoid neoplasms and beyond: evolving concepts and practical applications. Blood 117, 5019-5032, doi:10.1182/blood-2011-01-293050 (2011). 26 Dohner, H. et al. Diagnosis and management of acute myeloid leukemia in adults: recommendations from an international expert panel, on behalf of the European LeukemiaNet. Blood 115, 453-474, doi:10.1182/blood-2009-07-235358 (2010). 27 Craig, F. E. & Foon, K. A. Flow cytometric immunophenotyping for hematologic neoplasms. Blood 111, 3941-3967, doi:10.1182/blood-2007-11-120535 (2008). 28 Mrozek, K., Heerema, N. A. & Bloomfield, C. D. Cytogenetics in acute leukemia. Blood reviews 18, 115-136, doi:10.1016/s0268-960x(03)00040-7 (2004). 29 Frohling, S. et al. Comparison of cytogenetic and molecular cytogenetic detection of chromosome abnormalities in 240 consecutive adult patients with acute myeloid leukemia. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 20, 2480-2485 (2002). 30 Byrd, J. C. et al. Pretreatment cytogenetic abnormalities are predictive of induction success, cumulative incidence of relapse, and overall survival in adult patients with de novo acute myeloid leukemia: results from Cancer and Leukemia Group B (CALGB 8461). Blood 100, 4325-4336, doi:10.1182/blood-2002-03-0772 (2002). 31 Mrozek, K. et al. Comparison of cytogenetic and molecular genetic detection of t(8;21) and inv(16) in a prospective series of adults with de novo acute myeloid leukemia: a Cancer and Leukemia Group B Study. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 19, 2482-2492 (2001).
32 Grimwade, D. et al. Refinement of cytogenetic classification in acute myeloid leukemia: determination of prognostic significance of rare recurring chromosomal abnormalities among 5876 younger adult patients treated in the United Kingdom Medical Research Council trials. Blood 116, 354-365, doi:10.1182/blood-2009-11-254441 (2010). 33 Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia. N Engl J Med 368, 2059-2074, doi:10.1056/NEJMoa1301689 (2013). 34 Patel, J. P. et al. Prognostic relevance of integrated genetic profiling in acute myeloid leukemia. N Engl J Med 366, 1079-1089, doi:10.1056/NEJMoa1112304 (2012). 35 Pabst, T., Eyholzer, M., Fos, J. & Mueller, B. U. Heterogeneity within AML with CEBPA mutations; only CEBPA double mutations, but not single CEBPA mutations are associated with favourable prognosis. British journal of cancer 100, 1343-1346, doi:10.1038/sj.bjc.6604977 (2009). 36 Lowenberg, B., Griffin, J. D. & Tallman, M. S. Acute myeloid leukemia and acute promyelocytic leukemia. Hematology / the Education Program of the American Society of Hematology. American Society of Hematology. Education Program, 82-101 (2003). 37 Brunet, S. et al. Treatment of primary acute myeloid leukemia: results of a prospective multicenter trial including high-dose cytarabine or stem cell transplantation as post-remission strategy. Haematologica 89, 940-949 (2004). 38 Slovak, M. L. et al. Karyotypic analysis predicts outcome of preremission and postremission therapy in adult acute myeloid leukemia: a Southwest Oncology Group/Eastern Cooperative Oncology Group Study. Blood 96, 4075-4083 (2000). 39 Fung, H. C. et al. A long-term follow-up report on allogeneic stem cell transplantation for patients with primary refractory acute myelogenous leukemia: impact of cytogenetic characteristics on transplantation outcome. 
Biology of blood and marrow transplantation : journal of the American Society for Blood and Marrow Transplantation 9, 766-771, doi:10.1016/j.bbmt.2003.08.004 (2003). 40 Rowley, J. D. Chromosomal translocations: revisited yet again. Blood 112, 2183-2189, doi:10.1182/blood-2008-04-097931 (2008). 41 Mrozek, K. & Bloomfield, C. D. Clinical significance of the most common chromosome translocations in adult acute myeloid leukemia. Journal of the National Cancer Institute. Monographs, 52-57, doi:10.1093/jncimonographs/lgn003 (2008). 42 Kuo, Y. H. et al. Cbf beta-SMMHC induces distinct abnormal myeloid progenitors able to develop acute myeloid leukemia. Cancer cell 9, 57-68, doi:10.1016/j.ccr.2005.12.014 (2006). 43 Meyer, S. C. & Levine, R. L. Translational implications of somatic genomics in acute myeloid leukaemia. The Lancet. Oncology 15, e382-394, doi:10.1016/s1470-2045(14)70008-7 (2014). 44 Kanamaru, A. et al. All-trans retinoic acid for the treatment of newly diagnosed acute promyelocytic leukemia. Japan Adult Leukemia Study Group. Blood 85, 1202-1206 (1995). 45 Tallman, M. S. et al. All-trans retinoic acid in acute promyelocytic leukemia: long-term outcome and prognostic factor analysis from the North American Intergroup protocol. Blood 100, 4298-4302, doi:10.1182/blood-2002-02-0632 (2002). 46 Castilla, L. H. et al. The fusion gene Cbfb-MYH11 blocks myeloid differentiation and predisposes mice to acute myelomonocytic leukaemia. Nature genetics 23, 144-146, doi:10.1038/13776 (1999). 47 Higuchi, M. et al. Expression of a conditional AML1-ETO oncogene bypasses embryonic lethality and establishes a murine model of human t(8;21) acute myeloid leukemia. Cancer cell 1, 63-74 (2002). 48 Grisolano, J. L., Wesselschmidt, R. L., Pelicci, P. G. & Ley, T. J. Altered myeloid development and acute leukemia in transgenic mice expressing PML-RAR alpha under control of cathepsin G regulatory sequences. Blood 89, 376-387 (1997). 49 Pollock, J. L., Westervelt, P., Walter, M.
J., Lane, A. A. & Ley, T. J. Mouse models of acute promyelocytic leukemia. Current opinion in hematology 8, 206-211 (2001). 50 Daigle, S. R. et al. Potent inhibition of DOT1L as treatment of MLL-fusion leukemia. Blood 122, 1017-1025, doi:10.1182/blood-2013-04-497644 (2013). 51 Bernt, K. M. et al. MLL-rearranged leukemia is dependent on aberrant H3K79 methylation by DOT1L. Cancer cell 20, 66-78, doi:10.1016/j.ccr.2011.06.010 (2011). 52 Krauter, J. et al. Prognostic factors in adult patients up to 60 years old with acute myeloid leukemia and translocations of chromosome band 11q23: individual patient data-based meta-analysis of the German Acute Myeloid Leukemia Intergroup. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 27, 3000-3006, doi:10.1200/jco.2008.16.7981 (2009). 53 Mrozek, K. et al. Adult patients with de novo acute myeloid leukemia and t(9; 11)(p22; q23) have a superior outcome to patients with other translocations involving band 11q23: a cancer and leukemia group B study. Blood 90, 4532-4538 (1997). 54 Meyer, C. et al. New insights to the MLL recombinome of acute leukemias. Leukemia 23, 1490-1499, doi:10.1038/leu.2009.33 (2009). 55 Lindsley, R. C. et al. Acute myeloid leukemia ontogeny is defined by distinct somatic mutations. Blood 125, 1367-1376, doi:10.1182/blood-2014-11-610543 (2015). 56 Breems, D. A. et al. Monosomal karyotype in acute myeloid leukemia: a better indicator of poor prognosis than a complex karyotype. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 26, 4791-4797, doi:10.1200/jco.2008.16.0259 (2008). 57 Schlenk, R. F. et al. Mutations and treatment outcome in cytogenetically normal acute myeloid leukemia. N Engl J Med 358, 1909-1918, doi:10.1056/NEJMoa074306 (2008). 58 Falini, B. et al. Cytoplasmic nucleophosmin in acute myelogenous leukemia with a normal karyotype. N Engl J Med 352, 254-266, doi:10.1056/NEJMoa041974 (2005). 
59 Bacher, U., Schnittger, S. & Haferlach, T. Molecular genetics in acute myeloid leukemia. Current opinion in oncology 22, 646-655, doi:10.1097/CCO.0b013e32833ed806 (2010). 60 Becker, H. et al. Favorable prognostic impact of NPM1 mutations in older patients with cytogenetically normal de novo acute myeloid leukemia and associated gene- and microRNA-expression signatures: a Cancer and Leukemia Group B study. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 28, 596-604, doi:10.1200/jco.2009.25.1496 (2010). 61 Libura, M. et al. CEBPA copy number variations in normal karyotype acute myeloid leukemia: Possible role of breakpoint-associated microhomology and chromatin status in CEBPA mutagenesis. Blood cells, molecules & diseases 55, 284-292, doi:10.1016/j.bcmd.2015.07.002 (2015). 62 Reckzeh, K. & Cammenga, J. Molecular mechanisms underlying deregulation of C/EBPalpha in acute myeloid leukemia. International journal of hematology 91, 557-568, doi:10.1007/s12185-010-0573-1 (2010). 63 Taskesen, E. et al. Prognostic impact, concurrent genetic mutations, and gene expression features of AML with CEBPA mutations in a cohort of 1182 cytogenetically normal AML patients: further evidence for CEBPA double mutant AML as a distinctive disease entity. Blood 117, 2469-2475, doi:10.1182/blood-2010-09-307280 (2011). 64 Koschmieder, S., Halmos, B., Levantini, E. & Tenen, D. G. Dysregulation of the C/EBPalpha differentiation pathway in human cancer. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 27, 619-628, doi:10.1200/jco.2008.17.9812 (2009). 65 Wouters, B. J. et al. Double CEBPA mutations, but not single CEBPA mutations, define a subgroup of acute myeloid leukemia with a distinctive gene expression profile that is uniquely associated with a favorable outcome. Blood 113, 3088-3091, doi:10.1182/blood-2008-09-179895 (2009). 66 Nakao, M. et al.
Internal tandem duplication of the flt3 gene found in acute myeloid leukemia. Leukemia 10, 1911-1918 (1996). 67 Bullinger, L. et al. An FLT3 gene-expression signature predicts clinical outcome in normal karyotype AML. Blood 111, 4490-4495, doi:10.1182/blood-2007-09-115055 (2008). 68 Schnittger, S. et al. Analysis of FLT3 length mutations in 1003 patients with acute myeloid leukemia: correlation to cytogenetics, FAB subtype, and prognosis in the AMLCG study and usefulness as a marker for the detection of minimal residual disease. Blood 100, 59-66 (2002). 69 Stirewalt, D. L. & Radich, J. P. The role of FLT3 in haematopoietic malignancies. Nature reviews. Cancer 3, 650-665, doi:10.1038/nrc1169 (2003). 70 Whitman, S. P. et al. Absence of the wild-type allele predicts poor prognosis in adult de novo acute myeloid leukemia with normal cytogenetics and the internal tandem duplication of FLT3: a cancer and leukemia group B study. Cancer research 61, 7233-7239 (2001). 71 Cortes, J. E. et al. Phase I study of quizartinib administered daily to patients with relapsed or refractory acute myeloid leukemia irrespective of FMS-like tyrosine kinase 3-internal tandem duplication status. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 31, 3681-3687, doi:10.1200/jco.2013.48.8783 (2013). 72 Stone, R. M. et al. Patients with acute myeloid leukemia and an activating mutation in FLT3 respond to a small-molecule FLT3 tyrosine kinase inhibitor, PKC412. Blood 105, 54-60, doi:10.1182/blood-2004-03-0891 (2005). 73 Zhang, W. et al. Mutant FLT3: a direct target of sorafenib in acute myelogenous leukemia. Journal of the National Cancer Institute 100, 184-198, doi:10.1093/jnci/djm328 (2008). 74 Bullinger, L. et al. Use of gene-expression profiling to identify prognostic subclasses in adult acute myeloid leukemia. N Engl J Med 350, 1605-1616, doi:10.1056/NEJMoa031046 (2004). 75 Paschka, P. et al.
Adverse prognostic significance of KIT mutations in adult acute myeloid leukemia with inv(16) and t(8;21): a Cancer and Leukemia Group B Study. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 24, 3904-3911, doi:10.1200/jco.2006.06.9500 (2006). 76 Kindler, T. et al. Efficacy and safety of imatinib in adult patients with c-kit-positive acute myeloid leukemia. Blood 103, 3644-3654, doi:10.1182/blood-2003-06-2071 (2004). 77 Schubbert, S., Shannon, K. & Bollag, G. Hyperactive Ras in developmental disorders and cancer. Nature reviews. Cancer 7, 295-308, doi:10.1038/nrc2109 (2007). 78 Schnittger, S. et al. Screening for MLL tandem duplication in 387 unselected patients with AML identify a prognostically unfavorable subset of AML. Leukemia 14, 796-804 (2000). 79 Kihara, R. et al. Comprehensive analysis of genetic alterations and their prognostic impacts in adult acute myeloid leukemia patients. Leukemia 28, 1586-1595, doi:10.1038/leu.2014.55 (2014). 80 Taniuchi, I. & Littman, D. R. Epigenetic gene silencing by Runx proteins. Oncogene 23, 4341-4345, doi:10.1038/sj.onc.1207671 (2004). 81 Gaidzik, V. I. et al. RUNX1 mutations in acute myeloid leukemia: results from a comprehensive genetic and clinical analysis from the AML study group. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 29, 1364-1372, doi:10.1200/jco.2010.30.7926 (2011). 82 Mendler, J. H. et al. RUNX1 mutations are associated with poor outcome in younger and older patients with cytogenetically normal acute myeloid leukemia and with distinct gene and MicroRNA expression signatures. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 30, 3109-3118, doi:10.1200/jco.2011.40.6652 (2012). 83 Harada, H. et al. High incidence of somatic mutations in the AML1/RUNX1 gene in myelodysplastic syndrome and low blast percentage myeloid leukemia with myelodysplasia. 
Blood 103, 2316-2324, doi:10.1182/blood-2003-09-3074 (2004). 84 Michaud, J. et al. In vitro analyses of known and novel RUNX1/AML1 mutations in dominant familial platelet disorder with predisposition to acute myelogenous leukemia: implications for mechanisms of pathogenesis. Blood 99, 1364-1372 (2002). 85 Ley, T. J. et al. DNMT3A mutations in acute myeloid leukemia. N Engl J Med 363, 2424-2433, doi:10.1056/NEJMoa1005143 (2010). 86 Yan, X. J. et al. Exome sequencing identifies somatic mutations of DNA methyltransferase gene DNMT3A in acute monocytic leukemia. Nature genetics 43, 309-315, doi:10.1038/ng.788 (2011). 87 Marcucci, G. et al. IDH1 and IDH2 gene mutations identify novel molecular subsets within de novo cytogenetically normal acute myeloid leukemia: a Cancer and Leukemia Group B study. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 28, 2348-2355, doi:10.1200/jco.2009.27.3730 (2010). 88 Paschka, P. et al. IDH1 and IDH2 mutations are frequent genetic alterations in acute myeloid leukemia and confer adverse prognosis in cytogenetically normal acute myeloid leukemia with NPM1 mutation without FLT3 internal tandem duplication. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 28, 3636-3643, doi:10.1200/jco.2010.28.3762 (2010). 89 DiNardo, C. D. et al. Serum 2-hydroxyglutarate levels predict isocitrate dehydrogenase mutations and clinical outcome in acute myeloid leukemia. Blood 121, 4917-4924, doi:10.1182/blood-2013-03-493197 (2013). 90 Ward, P. S. et al. The common feature of leukemia-associated IDH1 and IDH2 mutations is a neomorphic enzyme activity converting alpha-ketoglutarate to 2-hydroxyglutarate. Cancer cell 17, 225-234, doi:10.1016/j.ccr.2010.01.020 (2010). 91 Xu, W. et al. Oncometabolite 2-hydroxyglutarate is a competitive inhibitor of alpha-ketoglutarate-dependent dioxygenases. Cancer cell 19, 17-30, doi:10.1016/j.ccr.2010.12.014 (2011). 92 Figueroa, M. E.
et al. Leukemic IDH1 and IDH2 mutations result in a hypermethylation phenotype, disrupt TET2 function, and impair hematopoietic differentiation. Cancer cell 18, 553-567, doi:10.1016/j.ccr.2010.11.015 (2010). 93 Wang, F. et al. Targeted inhibition of mutant IDH2 in leukemia cells induces cellular differentiation. Science (New York, N.Y.) 340, 622-626, doi:10.1126/science.1234769 (2013). 94 Chou, W. C. et al. TET2 mutation is an unfavorable prognostic factor in acute myeloid leukemia patients with intermediate-risk cytogenetics. Blood 118, 3803-3810, doi:10.1182/blood-2011-02-339747 (2011). 95 Tefferi, A., Lim, K. H. & Levine, R. Mutation in TET2 in myeloid cancers. N Engl J Med 361, 1117; author reply 1117-1118, doi:10.1056/NEJMc091348 (2009). 96 Moran-Crusio, K. et al. Tet2 loss leads to increased hematopoietic stem cell self-renewal and myeloid transformation. Cancer cell 20, 11-24, doi:10.1016/j.ccr.2011.06.001 (2011). 97 Schnittger, S. et al. ASXL1 exon 12 mutations are frequent in AML with intermediate risk karyotype and are independently associated with an adverse outcome. Leukemia 27, 82-91, doi:10.1038/leu.2012.262 (2013). 98 Metzeler, K. H. et al. ASXL1 mutations identify a high-risk subgroup of older patients with primary cytogenetically normal AML within the ELN Favorable genetic category. Blood 118, 6920-6929, doi:10.1182/blood-2011-08-368225 (2011). 99 Abdel-Wahab, O. et al. ASXL1 mutations promote myeloid transformation through loss of PRC2-mediated gene repression. Cancer cell 22, 180-193, doi:10.1016/j.ccr.2012.06.032 (2012). 100 Abdel-Wahab, O. & Levine, R. L. Mutations in epigenetic modifiers in the pathogenesis and therapy of acute myeloid leukemia. Blood 121, 3563-3572, doi:10.1182/blood-2013-01-451781 (2013). 101 King-Underwood, L. & Pritchard-Jones, K. Wilms' tumor (WT1) gene mutations occur mainly in acute myeloid leukemia and may confer drug resistance. Blood 91, 2961-2968 (1998). 102 Van Vlierberghe, P. et al. 
PHF6 mutations in adult acute myeloid leukemia. Leukemia 25, 130-134, doi:10.1038/leu.2010.247 (2011). 103 Maciejewski, J. P. & Padgett, R. A. Defects in spliceosomal machinery: a new pathway of leukaemogenesis. British journal of haematology 158, 165-173, doi:10.1111/j.1365-2141.2012.09158.x (2012). 104 Golub, T. R. et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science (New York, N.Y.) 286, 531-537 (1999). 105 Valk, P. J. et al. Prognostically useful gene-expression profiles in acute myeloid leukemia. N Engl J Med 350, 1617-1628, doi:10.1056/NEJMoa040465 (2004). 106 Schoch, C. et al. Acute myeloid leukemias with reciprocal rearrangements can be distinguished by specific gene expression profiles. Proceedings of the National Academy of Sciences of the United States of America 99, 10008-10013, doi:10.1073/pnas.142103599 (2002). 107 Marcucci, G. et al. Prognostic significance of, and gene and microRNA expression signatures associated with, CEBPA mutations in cytogenetically normal acute myeloid leukemia with high-risk molecular features: a Cancer and Leukemia Group B Study. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 26, 5078-5087, doi:10.1200/jco.2008.17.5554 (2008). 108 Mrozek, K. et al. Prognostic significance of the European LeukemiaNet standardized system for reporting cytogenetic and molecular alterations in adults with acute myeloid leukemia. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 30, 4515-4523, doi:10.1200/jco.2012.43.4738 (2012). 109 Groschel, S. et al. High EVI1 expression predicts outcome in younger adult patients with acute myeloid leukemia and is associated with distinct cytogenetic abnormalities. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 28, 2101-2107, doi:10.1200/jco.2009.26.0646 (2010). 110 Marcucci, G. et al.
High expression levels of the ETS-related gene, ERG, predict adverse outcome and improve molecular risk-based classification of cytogenetically normal acute myeloid leukemia: a Cancer and Leukemia Group B Study. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 25, 3337-3343, doi:10.1200/jco.2007.10.8720 (2007). 111 Haferlach, T. et al. Clinical utility of microarray-based gene expression profiling in the diagnosis and subclassification of leukemia: report from the International Microarray Innovations in Leukemia Study Group. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 28, 2529-2537, doi:10.1200/jco.2009.23.4732 (2010). 112 Patro, R., Mount, S. M. & Kingsford, C. Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms. Nature biotechnology 32, 462-464, doi:10.1038/nbt.2862 (2014). 113 Wilkerson, M. D. & Hayes, D. N. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics (Oxford, England) 26, 1572-1573, doi:10.1093/bioinformatics/btq170 (2010). 114 Robertson, G. et al. De novo assembly and analysis of RNA-seq data. Nature methods 7, 909-912, doi:10.1038/nmeth.1517 (2010). 115 Gao, J. et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Science signaling 6, pl1, doi:10.1126/scisignal.2004088 (2013). 116 Cerami, E. et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer discovery 2, 401-404, doi:10.1158/2159-8290.CD-12-0095 (2012). 117 Budczies, J. et al. Cutoff Finder: a comprehensive and straightforward Web application enabling rapid biomarker cutoff optimization. PloS one 7, e51862, doi:10.1371/journal.pone.0051862 (2012). 118 Tusher, V. G., Tibshirani, R. & Chu, G. Significance analysis of microarrays applied to the ionizing radiation response.
Proceedings of the National Academy of Sciences of the United States of America 98, 5116-5121, doi:10.1073/pnas.091062498 (2001). 119 Diaz-Uriarte, R. & Alvarez de Andres, S. Gene selection and classification of microarray data using random forest. BMC bioinformatics 7, 3, doi:10.1186/1471-2105-7-3 (2006). 120 Chase, A. et al. TFG, a target of chromosome translocations in lymphoma and soft tissue tumors, fuses to GPR128 in healthy individuals. Haematologica 95, 20-26, doi:10.3324/haematol.2009.011536 (2010). 121 Chinen, Y. et al. The leucine twenty homeobox (LEUTX) gene, which lacks a histone acetyltransferase domain, is fused to KAT6A in therapy-related acute myeloid leukemia with t(8;19)(p11;q13). Genes, chromosomes & cancer 53, 299-308, doi:10.1002/gcc.22140 (2014). 122 Lobry, C. et al. Notch pathway activation targets AML-initiating cell homeostasis and differentiation. The Journal of experimental medicine 210, 301-319, doi:10.1084/jem.20121484 (2013). 123 Verhaak, R. G. et al. Mutations in nucleophosmin (NPM1) in acute myeloid leukemia (AML): association with other gene abnormalities and previously established gene expression signatures and their favorable prognostic significance. Blood 106, 3747-3754, doi:10.1182/blood-2005-05-2168 (2005). 124 Lin, T. C. et al. CEBPA methylation as a prognostic biomarker in patients with de novo acute myeloid leukemia. Leukemia 25, 32-40, doi:10.1038/leu.2010.222 (2011). 125 Kiyoi, H. et al. Prognostic implication of FLT3 and N-RAS gene mutations in acute myeloid leukemia. Blood 93, 3074-3080 (1999). 126 Hsu, C. H. et al. Transcriptome Profiling of Pediatric Core Binding Factor AML. PloS one 10, e0138782, doi:10.1371/journal.pone.0138782 (2015). 127 Biernaux, C., Loos, M., Sels, A., Huez, G. & Stryckmans, P. Detection of major bcr-abl gene expression at a very low level in blood cells of some healthy individuals. Blood 86, 3118-3122 (1995). 128 Schmitt, C. et al. 
The bcl-2/IgH rearrangement in a population of 204 healthy individuals: occurrence, age and gender distribution, breakpoints, and detection method validity. Leukemia research 30, 745-750, doi:10.1016/j.leukres.2005.10.001 (2006). 129 McLean, C. M., Karemaker, I. D. & van Leeuwen, F. The emerging roles of DOT1L in leukemia and normal development. Leukemia 28, 2131-2138, doi:10.1038/leu.2014.169 (2014). 130 Singer, M. S. et al. Identification of high-copy disruptors of telomeric silencing in Saccharomyces cerevisiae. Genetics 150, 613-632 (1998). 131 Hernandez, L. et al. TRK-fused gene (TFG) is a new partner of ALK in anaplastic large cell lymphoma producing two structurally different TFG-ALK translocations. Blood 94, 3265-3268 (1999).   80 132 Block, A. W. et al. Rare recurring balanced chromosome abnormalities in therapy-related myelodysplastic syndromes and acute leukemia: report from an international workshop. Genes, chromosomes & cancer 33, 401-412 (2002). 133 Patnaik, M. M. et al. Chromosome 8p11.2 translocations: prevalence, FISH analysis for FGFR1 and MYST3, and clinicopathologic correlates in a consecutive cohort of 13 cases from a single institution. American journal of hematology 85, 238-242, doi:10.1002/ajh.21631 (2010). 134 Katsumoto, T. et al. MOZ is essential for maintenance of hematopoietic stem cells. Genes & development 20, 1321-1330, doi:10.1101/gad.1393106 (2006). 135 Paggetti, J. et al. Crosstalk between leukemia-associated proteins MOZ and MLL regulates HOX gene expression in human cord blood CD34+ cells. Oncogene 29, 5019-5031, doi:10.1038/onc.2010.254 (2010). 136 Sweetser, D. A. et al. Delineation of the minimal commonly deleted segment and identification of candidate tumor-suppressor genes in del(9q) acute myeloid leukemia. Genes, chromosomes & cancer 44, 279-291, doi:10.1002/gcc.20236 (2005). 137 Roy, M., Li, Z. & Sacks, D. B. IQGAP1 binds ERK2 and modulates its activity. 
The Journal of biological chemistry 279, 17329-17337, doi:10.1074/jbc.M308405200 (2004). 138 Muntean, A. G. et al. MLL fusion protein-driven AML is selectively inhibited by targeted disruption of the MLL-PAFc interaction. Blood 122, 1914-1922, doi:10.1182/blood-2013-02-486977 (2013). 139 Kannan, M. B., Solovieva, V. & Blank, V. The small MAF transcription factors MAFF, MAFG and MAFK: current knowledge and perspectives. Biochimica et biophysica acta 1823, 1841-1846, doi:10.1016/j.bbamcr.2012.06.012 (2012). 140 Langer, C. et al. High BAALC expression associates with other molecular prognostic markers, poor outcome, and a distinct gene-expression signature in cytogenetically normal patients younger than 60 years with acute myeloid leukemia: a Cancer and Leukemia Group B (CALGB) study. Blood 111, 5371-5379, doi:10.1182/blood-2007-11-124958 (2008). 141 Heuser, M. et al. MN1 overexpression induces acute myeloid leukemia in mice and predicts ATRA resistance in patients with AML. Blood 110, 1639-1647, doi:10.1182/blood-2007-03-080523 (2007). 142 Martinez, R. et al. A microarray-based DNA methylation study of glioblastoma multiforme. Epigenetics 4, 255-264 (2009). 143 Pedersen, I. S. et al. Promoter switch: a novel mechanism causing biallelic PEG1/MEST expression in invasive breast cancer. Human molecular genetics 11, 1449-1453 (2002). 144 Bai, Y. et al. Overexpression of DICER1 induced by the upregulation of GATA1 contributes to the proliferation and apoptosis of leukemia cells. International journal of oncology 42, 1317-1324, doi:10.3892/ijo.2013.1831 (2013). 145 Stralen, E. et al. MafB oncoprotein detected by immunohistochemistry as a highly sensitive and specific marker for the prognostic unfavorable t(14;20) (q32;q12) in multiple myeloma patients. Leukemia 23, 801-803, doi:10.1038/leu.2008.284 (2009). 146 Wang, P. W. et al. Human KRML (MAFB): cDNA cloning, genomic structure, and evaluation as a candidate tumor suppressor gene in myeloid leukemias. 
Genomics 59, 275-281, doi:10.1006/geno.1999.5884 (1999).   81 147 Miller, B. G. & Stamatoyannopoulos, J. A. Integrative meta-analysis of differential gene expression in acute myeloid leukemia. PloS one 5, e9466, doi:10.1371/journal.pone.0009466 (2010).   

