UBC Faculty Research and Publications

Structural analysis of the genome of breast cancer cell line ZR-75-30 identifies twelve expressed fusion… Schulte, Ina; Batty, Elizabeth M; Pole, Jessica C; Blood, Katherine A; Mo, Steven; Cooke, Susanna L; Ng, Charlotte; Howe, Kevin L; Chin, Suet-Feung; Brenton, James D; Caldas, Carlos; Howarth, Karen D; Edwards, Paul A Dec 22, 2012


Full Text

RESEARCH ARTICLE Open Access

Structural analysis of the genome of breast cancer cell line ZR-75-30 identifies twelve expressed fusion genes

Ina Schulte1†, Elizabeth M Batty1,3†, Jessica CM Pole1,4, Katherine A Blood1,5, Steven Mo1,6, Susanna L Cooke2,7, Charlotte Ng2,8, Kevin L Howe2,9, Suet-Feung Chin2, James D Brenton2, Carlos Caldas2, Karen D Howarth1*† and Paul AW Edwards1*

Abstract

Background: It has recently emerged that common epithelial cancers such as breast cancers have fusion genes like those in leukaemias. In a representative breast cancer cell line, ZR-75-30, we searched for fusion genes by analysing genome rearrangements.

Results: We first analysed rearrangements of the ZR-75-30 genome, to around 10 kb resolution, by molecular cytogenetic approaches, combining array painting and array CGH. We then compared this map with genomic junctions determined by paired-end sequencing. Most of the breakpoints found by array painting and array CGH were identified in the paired-end sequencing: 55% of the unamplified breakpoints and 97% of the amplified breakpoints (as these are represented by more sequence reads). From this analysis we identified 9 expressed fusion genes: APPBP2-PHF20L1, BCAS3-HOXB9, COL14A1-SKAP1, TAOK1-PCGF2, TIAM1-NRIP1, TIMM23-ARHGAP32, TRPS1-LASP1, USP32-CCDC49 and ZMYM4-OPRD1. We also determined the genomic junctions of a further three expressed fusion genes that had been described by others, BCAS3-ERBB2, DDX5-DEPDC6/DEPTOR and PLEC1-ENPP2. Of this total of 12 expressed fusion genes, 9 were in the coamplification. Due to the sensitivity of the technologies used, we estimate these 12 fusion genes to be around two-thirds of the true total. Many of the fusions seem likely to be driver mutations. For example, PHF20L1, BCAS3, TAOK1, PCGF2, and TRPS1 are fused in other breast cancers. HOXB9 and PHF20L1 are members of gene families that are fused in other neoplasms.
Several of the other genes are relevant to cancer: in addition to ERBB2, SKAP1 is an adaptor for Src, DEPTOR regulates the mTOR pathway and NRIP1 is an estrogen-receptor coregulator.

Conclusions: This is the first structural analysis of a breast cancer genome that combines classical molecular cytogenetic approaches with sequencing. Paired-end sequencing was able to detect almost all breakpoints, where there was adequate read depth. It supports the view that gene breakage and gene fusion are important classes of mutation in breast cancer, with a typical breast cancer expressing many fusion genes.

Keywords: Breast cancer, Chromosome aberrations, Genomics, Fusion genes

* Correspondence: kdh29@cam.ac.uk; pawe1@cam.ac.uk
† Equal contributors
1 Hutchison/MRC Research Centre and Department of Pathology, University of Cambridge, Cambridge, UK
Full list of author information is available at the end of the article

© 2012 Schulte et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Schulte et al. BMC Genomics 2012, 13:719
http://www.biomedcentral.com/1471-2164/13/719

Background

In the last few years it has emerged that the common epithelial cancers, such as carcinoma of breast, prostate and lung, have fusion genes like those long associated with leukaemias, lymphomas and sarcomas [1,2]. The first to be discovered were in prostate cancer, where about half of all cases have the TMPRSS2-ERG fusion gene [3,4], and lung cancer, where around 5% of lung cancers have a fusion that activates the ALK tyrosine kinase, the EML4-ALK fusion [5].
However, these early examples were found by essentially ‘one-off’ methods, and did not answer the question of how many fusions a typical carcinoma expresses ([4,5] reviewed in [1]).

In addition to creating fusion genes, the abundant genome rearrangements in these cancers break many other genes, and since breakage will almost always affect gene function, rearrangement is likely to make a significant contribution to inactivating genes [1,6].

Recent technical developments now allow systematic searches for genome rearrangements and hence fusion genes [1]. ‘Array painting’, i.e. hybridization of individual chromosomes to a genomic microarray, allows many chromosome rearrangements (though not inversions) to be analyzed to almost 1 kb resolution [7-9]. ‘Paired-end sequencing’ can be used to identify rearrangements by finding breakpoint junctions: small genomic DNA fragments, typically 250-500 bp, are sequenced from both ends and the paired sequence reads examined to see whether they are the expected distance apart on the reference genome [10-12]. A variation is ‘mate-pairs’, where fragments of 3 to 5 kb are end-sequenced [11]. Paired-end sequencing is also being applied to cDNA to find fusion transcripts directly [13-15].

To search for fusion genes in a representative breast cancer we chose the ZR-75-30 breast cancer cell line [16]. It has a typically rearranged karyotype, and a typical high-copy-number coamplification of parts of chromosomes 8 and 17, particularly 8q24 and 17q11-24, forming five homogeneously staining regions (hsrs) [17]. As often seen in breast cancer [18-22], this is a complex coamplification of many small fragments of the genome. The amplification is relevant to the search for fusion genes as some amplifications harbour fusion genes, perhaps formed early in cancer development and subsequently amplified [10,20,21].
ZR-75-30 is also of interest as it is estrogen-receptor-positive (ER+) and has been used as a model of an ER+ breast cancer that is insensitive to tamoxifen, in contrast to the sensitive line ZR-75-1 (which was from an unrelated patient) [16].

To find fusion transcripts in ZR-75-30, we refined our previous 1-Mb resolution array-painting analysis of its karyotype [8], using high-resolution array CGH data. Then we applied paired-end sequencing to identify rearrangement junctions, particularly those in the amplification, which are preferentially sampled because they are present in multiple copies.

Materials and methods

Nomenclature, genome positions and transcripts

Genome positions are relative to GRCh37/hg19. Exon numbering is from the Ensembl transcripts listed in Additional file 1. Gene names follow HUGO Gene Nomenclature and protein reference numbers are from the UniProtKB/Swiss-Prot database.

Cells, DNA, RNA

ZR-75-30 cells were as used previously [8,17], derived from a sample frozen in 1999 by Dr M.J. O’Hare, Ludwig Institute for Cancer Research/UCL Breast Cancer Laboratory, London, U.K., who had obtained them from the American Type Culture Collection. We authenticated them by STR (short tandem repeat) analysis, and they matched the ATCC database at all eight specified loci. Further evidence for their authenticity was that the fusion genes we described were common to other stocks of the line held by the ATCC and other laboratories (see Results). The cells were maintained on 50:50 DMEM:F12 medium (Invitrogen, Grand Island, NY, USA) with 10 μg/ml insulin and 10% foetal bovine serum. Non-cancer breast cell lines, used to investigate expression in normal breast, were from the originators: HB4a is a line immortalized from purified breast luminal epithelial cells [23] and the HMT3522 line was from fibrocystic (non-cancer) breast [24]. Other breast cancer cell lines were as described [17,25].
Genomic DNA, total RNA and random-primed cDNA were prepared as described [26].

Array-CGH data

Data were kindly provided by the Wellcome Trust Sanger Institute [27]. Breakpoint intervals were judged by eye and confirmed by segmentation using the PICNIC algorithm [28].

Paired-end sequencing

ZR-75-30 genomic DNA was sequenced in paired-end read mode using the Illumina GAIIx Genome Analyzer and HiSeq2000 (Illumina, Great Chesterford, UK) [10,29]. Briefly, we sheared 5 μg of genomic DNA by sonication using a Bioruptor sonicator (Diagenode, Liège, Belgium). The fragmented DNA was end-repaired and a 3’ overhang was created, followed by ligation of Illumina paired-end adaptor oligonucleotides. We size-selected fragments at 400–600 bp by agarose gel electrophoresis, and enriched for fragments with primers on either end by an 18-cycle PCR reaction. A total of five flowcell lanes were sequenced. We obtained 43 million 36-bp paired sequences (counting only unique reads with high-quality mapping) from one 500 bp library (median 504 bp, range 404–619 bp), equivalent to average 1.7-fold coverage of single-copy breakpoints in this subtetraploid genome.

Two additional paired-end sequencing libraries were made by the ‘mate-pair’ approach [11]: 3 kb DNA fragments were circularized and the junction fragments isolated as a paired-end library, using reagent kits supplied by Illumina. A single lane of each 3 kb library was sequenced, yielding about 1.25 million paired sequences, equivalent to 0.5-fold coverage of single-copy breaks.

Alignment and fusion prediction

In outline, analysis steps were: (i) alignment of sequencing reads, (ii) identifying aberrant read pairs, i.e.
read pairs that aligned but not in the expected orientation or separation, (iii) clustering concordant aberrant read pairs to find candidate structural variants, and filtering of those candidates, (iv) prediction and verification of fusion genes.

Raw sequences were obtained from Illumina’s standard image analysis (FIRECREST) and base calling (BUSTARD) modules. Reads were aligned to the reference genome GRCh37/hg19 with BWA [30] to identify and remove normal read pairs, which align to the genome with the expected distance apart and orientation. Non-normal reads were then realigned using Novoalign (Novocraft Technologies, Selangor, Malaysia), a slower but more thorough aligner. Novoalign gives each read a mapping quality score, a measure of the confidence of mapping, and read pairs in which either read scored below 30 were discarded. Library preparation involved a PCR amplification step which can result in duplicate copies of the same read pair being sequenced: exact PCR duplicates were identified, and all but one copy removed, using Picard (http://picard.sourceforge.net/; [31]). This gave ‘aberrant read pairs’, read pairs that aligned but not with normal separation and orientation. These were then grouped into clusters of read pairs that were consistent with the same rearrangement junction: a minimum of two consistent read pairs was required. Additional filters were then applied. Read pairs were checked for a possible normal match to the reference genome using BLAT [32], since the alignment software sometimes aligns a read to a homologous sequence instead of its true match, perhaps because of sequencing errors or polymorphisms. Read pairs that were offset from one another by one or two bp were also discarded, as likely to be PCR duplicates where a primer had lost one or two 3’ base pairs. Known normal human copy number variations [33] were discarded.
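
The classification and clustering steps described so far can be illustrated with a minimal Python sketch. It is not the authors' pipeline (which used BWA, Novoalign and Picard); the `ReadPair` record, the function names and the 600 bp clustering window are hypothetical, and read 1 is assumed to be the leftmost read of each pair. Only the mapping-quality cut-off of 30, the 404–619 bp fragment range and the two-read-pair minimum are taken from the text.

```python
from collections import defaultdict
from dataclasses import dataclass

MIN_MAPQ = 30                # mapping-quality cut-off quoted in the text
MIN_SEP, MAX_SEP = 404, 619  # fragment-size range of the 500 bp library (from the text)
MIN_PAIRS = 2                # minimum consistent read pairs per cluster (from the text)
WINDOW = 600                 # hypothetical clustering window in bp (not from the paper)


@dataclass(frozen=True)
class ReadPair:
    """Hypothetical record for one aligned read pair (read 1 assumed leftmost)."""
    chrom1: str
    pos1: int
    strand1: str
    chrom2: str
    pos2: int
    strand2: str
    mapq1: int
    mapq2: int


def is_normal(pair):
    """Normal: same chromosome, inward-facing (+/-) reads, expected separation."""
    if pair.chrom1 != pair.chrom2:
        return False
    separation = pair.pos2 - pair.pos1
    return (pair.strand1, pair.strand2) == ("+", "-") and MIN_SEP <= separation <= MAX_SEP


def aberrant_pairs(pairs):
    """Drop exact duplicates, low-confidence mappings and normal pairs."""
    unique = dict.fromkeys(pairs)  # keeps the first copy of each identical pair
    return [p for p in unique
            if min(p.mapq1, p.mapq2) >= MIN_MAPQ and not is_normal(p)]


def cluster(pairs, window=WINDOW, min_pairs=MIN_PAIRS):
    """Greedy grouping of pairs that share chromosomes and strands and whose
    two ends both lie within `window` bp of the first pair in the group."""
    by_signature = defaultdict(list)
    for p in pairs:
        by_signature[(p.chrom1, p.strand1, p.chrom2, p.strand2)].append(p)

    clusters = []
    for members in by_signature.values():
        members.sort(key=lambda p: (p.pos1, p.pos2))
        current = [members[0]]
        for p in members[1:]:
            seed = current[0]
            if abs(p.pos1 - seed.pos1) <= window and abs(p.pos2 - seed.pos2) <= window:
                current.append(p)
            else:
                if len(current) >= min_pairs:
                    clusters.append(current)
                current = [p]
        if len(current) >= min_pairs:
            clusters.append(current)
    return clusters
```

Calling `cluster(aberrant_pairs(pairs))` returns candidate junctions, each supported by at least two read pairs. The greedy grouping is a deliberate simplification; a real caller would also have to handle the later filters (BLAT re-checks, near-duplicate offsets, known copy number variants).
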
Apparent variants were removed if they also appeared in a pool of paired-end sequences from 18 other unrelated samples from cancers, normal tissue or cell lines. Apparent intra-chromosomal rearrangements spanning less than 10 kb were also discarded, as most would be polymorphisms or outsize fragments. (Note that this does not remove all small rearrangements, such as small apparent insertions, e.g. the two apparent junctions between chromosome 1 at 109.65 Mb and a fragment of chr22 at 30.16 Mb. Such ‘insertions’ may be deletions in the reference genome.)

Gene fusions and breakage were predicted from the resulting rearrangement breakpoints using the Ensembl Application Programming Interface (http://www.ensembl.org/info/docs/api/index.html) to retrieve all the genes that overlapped the breakpoints, or were adjacent to breakpoints. To predict whether a fusion transcript could be formed we considered whether the 5’ or 3’ end of a gene would be retained, and whether, when the 5’ end of a gene was retained, a ‘runthrough’ fusion could be formed by transcription into a downstream intact gene near the junction.

Verification, Cloning and Sequencing of Junctions

Selected genomic junctions were verified by PCR using primers designed to flank the junction (Additional file 2 and Additional file 3; Eurofins MWG Operon, Ebersberg, Germany), using DNA pooled from twenty normal individuals as a control. To detect fusion transcripts, we amplified from cDNA using primers in flanking exons of the expected fusions. Selected full-length transcripts were then amplified using primers designed to include the putative start and stop codons. Amplification was for 35 cycles with an annealing temperature of 58°C using HotMaster Taq DNA Polymerase (5 PRIME GmbH, Hamburg, Germany) or, for long-range PCR, Elongase Enzyme Mix (Invitrogen, Carlsbad, CA, USA) with 2 mM Mg2+. PCR products were sequenced in both directions, generally after cloning using a TOPO TA cloning kit (Invitrogen, Carlsbad, CA, USA).
Primers used for cloning genomic and cDNA junctions are given in Additional file 4.

This study did not require ethical approval.

Results

Refined cytogenetic map of the ZR-75-30 genome

We first refined our previous analysis of the karyotype of ZR-75-30 to ~10 kb resolution. In our previous analysis we used array painting, in which chromosomes are isolated by flow cytometry and hybridized individually to genomic arrays, to identify the components of each chromosome [8]. This had given us a map of inter-chromosome rearrangements spanning more than about 3 Mb. This analysis was refined by matching the unbalanced breakpoints with array-comparative genomic hybridization (array-CGH) on the SNP6 platform, from Bignell et al. [27] (Additional file 5). Some additional copy number steps, below the resolution of the array painting, were revealed in the unamplified regions, notably additional breaks on chromosome 1 (which are most likely additional internal rearrangement of the 1;21 chromosome translocation named peak G in Howarth et al. [8]) (Additional file 5 and Additional file 6).

We then overlaid a list of breakpoint junctions obtained by paired-end sequencing (Additional file 5). These junctions had been filtered in various ways to reduce artefacts (see Methods). We additionally required junctions to be identified by at least two independent read pairs in one library and either (i) to be present in more than one of the three libraries sequenced or (ii) to correspond to a copy number step in the SNP6 array-CGH data [27] or the array painting data [8].

Figure 1 Chromosome rearrangements observed in the genome of breast cancer cell line ZR-75-30. A. Genome-wide Circos plot of structural variation in the ZR-75-30 genome. An ideogram of a normal karyotype is shown around the outside. Copy number variation is represented by the blue line, shown inside the ideogram. Chromosome rearrangements are depicted with green (interchromosomal) and purple (intrachromosomal) lines. All 125 structural variants shown have been independently validated by PCR (red lines) or by matching to a copy number step on a SNP6 array. B. Structural variation in the complex 8;17 amplicon. Colours as in A. C. Copy number variation in the 8;17 amplicon of ZR-75-30. Ideograms of chromosomes 8 and 17 are shown, with regions containing amplified segments highlighted with a red box and expanded below. Copy number changes, as measured by paired-end sequencing (reads per bin against genome position), illustrate the complexity of the amplification. An example set of ten confirmed rearrangement junctions is shown with green (interchromosomal) and purple (intrachromosomal) lines. Genome positions are based on hg19.

This strategy yielded 318 apparent genomic junctions (Additional file 7), of which 112 were identified as likely to explain a copy number step or match a junction in the array painting data (Additional file 5). Of the 318 genomic junctions, we identified 47 that were predicted to fuse genes, and tested for them by PCR on genomic DNA. Of these 47 junctions, 37 were successfully amplified: 24/25 of the junctions that were associated with copy number steps, compared to 13/22 of those that were not. Two of these 13 junctions, not associated with detectable copy number change, were also amplified from pooled normal genomic DNAs and therefore were not considered further. The 125 genomic junctions that had been confirmed by an associated copy number step (89), or positive PCR product (13), or both (23), are illustrated in Figure 1 and Additional file 2.
Of these, 62% are intra-chromosomal rearrangements.

We were able to identify breakpoint junctions corresponding to most of the previously-known breakpoints: about 55% of the breakpoints in unamplified regions, and 97% of the breakpoints (identified from copy number steps) in the amplified regions of chromosomes 8 and 17, which, because they are present in many copies, gave more reads in the sequencing (Additional file 5, Additional file 7 and Figure 1).

The array-CGH showed that the coamplification of chromosomes 8 and 17 was very complex (Figure 1C), too complex for all the fragments and copy number steps to be resolved. A reliable map of the amplicon cannot be assembled from these junctions alone, because not all junctions would have been detected, some may be spurious, and there are usually multiple ways to assemble a given set of junctions into a linear map [34,35]. However, we show one possible assembly of 10 of the junctions from chromosomes 8 and 17, to illustrate the complexity (Additional file 8). There was also a junction, verified by genomic PCR, that may well represent the join between the 8;17 amplification and flanking chromosome 14 material. It joins 84.97 Mb on chromosome 14 to 102.54 Mb on chromosome 8. All four chromosomes that carry blocks of 8;17 coamplification also carry 14q (chromosome fractions C, D, F and L in ref. [8]), so this join may be the same on all of them.

Gene fusions

We found a total of 12 expressed gene fusions: we predicted 9 from paired-end sequencing, and we confirmed a further 3 that were reported by Robinson et al. [15], also identifying the structural rearrangements that had generated these additional fusions.

Our nine fusion genes were found by searching junctions computationally to identify potentially fused genes, followed by manual inspection (Additional file 7). Junctions predicted to create fusions were verified by PCR on genomic DNA, as above, and the predicted transcripts were tested for by PCR from cDNA.
Of thirty predicted fusion transcripts, nine were successfully amplified (Table 1, Figure 2; for junction sequences see Additional file 1 and Additional file 3), including two of thirteen predicted ‘run-through’ fusions, i.e. fusions formed by breakage of the 5’ gene and transcription from this gene into an intact downstream gene (Figure 2). Some of the failures to amplify junctions and fusion transcripts may have been technical failures, or due to errors in mapping the paired sequences, or because the rearrangements were more complex than the automated analysis revealed.

We showed, by PCR, that all twelve of the fusions were present in other available stocks of the ZR-75-30 cell line, and were not the result of recent evolution in our cultures. All the fusion transcripts were present in the ZR-75-30 stock used in Robinson et al. [15], tested using RNA kindly provided for the purpose by Prof Reis-Filho, Breakthrough Breast Cancer Research Centre, Institute of Cancer Research, London, UK (passage 5 after receipt from ATCC), and in a separate stock from the Institute of Cancer Research. Furthermore, the genomic junctions that create the twelve fusions were all present in a DNA sample newly purchased direct from ATCC.

As found previously in breast cancer cell lines [12,36], a number of the genomic breakpoint junctions showed microhomology (four out of seven sequenced junctions had 1–4 bp of microhomology), and one contained a small fragment of sequence inserted from elsewhere in the genome, termed a ‘genomic shard’ [37] (Additional file 3). This may be characteristic of a microhomology-mediated break-induced-replication (MMBIR) mechanism [38]. Our strategy may overlook some of these complex junctions.

Of the 12 fusions (Figure 2), nine were from the coamplification of chromosomes 8 and 17.
Four were ‘run-through fusions’, where transcription runs from a broken 5’ gene into an intact downstream gene, with splicing into the first splice acceptor, usually the second exon. Two fusions spanned two or more junctions (Figure 2A and B).

Fusion genes in the (8;17) amplicon

APPBP2-PHF20L1

Paired-end reads suggested a complex rearrangement that joined part of APPBP2 and PHF20L1 (Figure 2A; Additional file 7). We confirmed the presence of a double junction at the genomic level by amplifying the expected 10.4 kb APPBP2 insert by long-range PCR between chromosome 8 and PHF20L1 intron 2 (Additional file 3).

A fusion transcript was detected that splices exon 9 of APPBP2 in frame to exon 3 of PHF20L1 isoform 2 (ENST00000337920: the Ensembl transcripts from which the exon numbering was taken are listed in Additional file 1) (Figure 2A). Additionally, an alternatively spliced, out-of-frame fusion transcript was detected (Figure 2A).

This is likely to be only part of the fusion transcript, since exon 5 is not a known transcription start site of APPBP2.
The upstream genomic junction (Figure 2A) joins APPBP2 intron 4 to chromosome 8 at 109.67 Mb, but does not join it to any known gene; presumably there is a further rearrangement junction upstream of this.

COL14A1-SKAP1

A full-length fusion transcript was amplified in which COL14A1 exon 2 was joined in frame to exon 5 of SKAP1 (Table 1, Figure 2A, Additional file 1 and Additional file 7). Additional products were amplified that included cryptic exons of varying length from within intron 4 of SKAP1 (Figure 2A), but in these transcripts SKAP1 is out of frame. Additionally, a splice variant lacking exon 7 was found, introducing a stop codon in exon 8. Exon-7-skipped transcripts were observed in other breast cancer cell lines. It is not clear whether SKAP1 is upregulated by fusion as its expression was very variable among normal and breast cancer cell lines. In T lymphocytes SKAP1 is associated with ADAP, and ADAP mRNA was detected in ZR-75-30 and other cell lines with relatively high SKAP1 expression.

Table 1 Verified expressed gene fusions in the breast cancer cell line ZR-75-30 predicted from structural analysis

5’ gene    3’ gene     Chr 5’ (a)  Chr 3’ (a)  Expression  In frame (e)  Fusion (F) or runthrough fusion (R)
APPBP2     PHF20L1     17          8           yes         yes           F
COL14A1    SKAP1       8           17          yes         yes           F
TAOK1      PCGF2       17          17          yes (b,d)   yes           F
USP32      CCDC49      17          17          yes (d)     no            F
BCAS3      HOXB9       17          17          yes (d)     see text      F
TRPS1      LASP1       8           17          yes         yes           R
ERBB2      BCAS3       17          17          yes (c)     no            R
DDX5       DEPDC6      17          8           yes (c)     yes           R
PLEC1      ENPP2       8           8           yes (c)     yes           F
TIAM1      NRIP1       21          21          yes (b)     yes           F
ZMYM4      OPRD1       1           1           yes         no            F
TIMM23     ARHGAP32    10          11          yes         no            R
TMEM74     APPBP2      8           17          no                        F
TRAPPC9    STARD3      8           17          no                        F
SSH2       PLXDC1      17          17          no                        F
TAOK1      CA10        17          17          no                        F
HYLS1      TIMM23      11          10          no (b)                    F
USP32      RALYL       17          8           no                        F
TMEM74     ACACA       8           17          no                        F
NUDCD1     TAC4        8           17          no                        R
TRAPPC9    HOXB6       8           17          no                        R
SSH2       NFE2L1      17          17          no                        R
TTC35      MKS1        8           17          no                        R
TMEM71     CRYBA1      8           17          no                        R
CA3        KIAA1429    8           8           no                        R
GRHL2      NUDCD1      8           8           no                        R
SUPT6H     GPIHBP1     8           17          no                        R
PGAP3      NOV         8           17          no                        R
KIAA0100   LY6H        8           17          no                        R
TG         ERBB2       8           17          no                        R

All genomic junctions tested were positive by PCR; those marked c were not tested.
a Precise chromosomal positions are given in Additional file 2 and Additional file 5 and the exon structure in Figure 2.
b 5’ gene is untranslated sequence only.
c Fusions not predicted by our analysis but detected by transcriptome sequencing by Robinson et al. (2011) and confirmed here by RT-PCR. Genomic breakpoints were detected in the present dataset on additional inspection; they had not met our stringent criteria or were complex rearrangements.
d Fusions also reported by Robinson et al. (2011).
e Predicted from annotations; not experimentally verified.

TAOK1-PCGF2/MEL18, USP32-CCDC49, BCAS3-HOXB9, TRPS1-LASP1

The TAOK1-PCGF2, USP32-CCDC49, and BCAS3-HOXB9 fusion transcripts were detected by RT-PCR essentially as expected (Figure 2A), except that the splice donor and acceptor sites of TAOK1 and PCGF2 in their fusion transcript were both offset a few base pairs from the splice junctions reported by Ensembl (Additional file 1). These three fusions were also detected by transcriptome sequencing [15]. The TRPS1-LASP1 fusion joins exon 3 of the transcription factor TRPS1, by transcription running through, in frame, to exon 2 of LASP1 (Figure 2A).

BCAS3-ERBB2, DDX5-DEPDC6/DEPTOR, PLEC1-ENPP2

These three fusion transcripts were not discovered by our initial analysis. They were reported by Robinson et al. [15], who detected 6 fusion transcripts in ZR-75-30 by sequencing cDNA, three of which we had found (Table 1). We confirmed the additional three fusion transcripts by RT-PCR, and we also identified their genomic junctions in our sequencing data (Figure 2B).
We had failed to discover two of these fusions because of limitations of our fusion prediction: DDX5-DEPDC6/DEPTOR is a ‘run-through’ fusion (see above) that had been obscured by other possible downstream fusion partners, while PLEC1-ENPP2 was formed by a complex rearrangement apparently comprising two genomic junctions (Figure 2B). The BCAS3-ERBB2 breakpoint junction was present only in one mate-pair library and therefore had not met our stringent criteria.

Fusion genes not in the amplicon, TIAM1-NRIP1, TIMM23-ARHGAP32, ZMYM4-OPRD1

The TIAM1-NRIP1 fusion was predicted both from paired-end sequencing and from combining the array painting with SNP6 array-CGH. It was probably formed by a simple 16-Mb interstitial deletion on chromosome 21, since the presumed deleted region is at lower copy number in array CGH and absent from the array painting hybridisation of chromosome der(1)t(1;21)del(21) (peak G in [8]). A full-length transcript was amplified, with TIAM1 exon 1 fused to NRIP1 exon 2 (Table 1, Figure 2C, Additional file 1).

Figure 2 Schematic representation of gene fusions and the expressed fusion transcripts in the breast cancer cell line ZR-75-30 (not to scale). A. Fusions in the 8;17 amplicon. B. Structure of fusion transcripts detected by Robinson et al. [15]. C. Fusions at single copy breaks. Relevant exons are represented as numbered boxes, the transcription start site (AUG) is indicated with a black arrow and the breakpoint is indicated with a zig-zag line at the approximate chromosomal position (based on the UCSC Genome Browser, hg19). The (sequenced) expressed fusion transcripts and (where applicable) alternative splice products are shown below as black boxes joined by a dotted line. Exons depicted in grey are expected to be expressed, but were not sequenced. For numbering of exons see Additional file 1.

The TIMM23-ARHGAP32 fusion is the result of a translocation between chromosomes 10 and 11. TIMM23 is broken and transcription runs into the intact ARHGAP32 gene, joining exon 6 of TIMM23 to exon 2 of ARHGAP32 (Figure 2C).

The ZMYM4-OPRD1 fusion is the result of an internal rearrangement of chromosome 1 (Table 1, Figure 2C, Additional file 7). Two transcripts were observed, both joining OPRD1 out of frame and leading to a stop codon shortly after the breakpoint. A major transcript was detected, fusing ZMYM4 exon 26 to OPRD1 exon 2 as expected (Figure 2C), and a minor transcript, splicing ZMYM4 exon 25 to OPRD1 exon 2 (Figure 2C).

We were unable to clone and sequence the ZMYM4-OPRD1 genomic junction, but several junctions were detected in this region of chromosome 1, suggesting that the rearrangement may be complex.

Discussion

Analysis of the ZR-75-30 genome

Together, these data provide a gene-level analysis of most of the unamplified genome rearrangements in this cell line, of more than 10 kb span. A few details are still missing, notably the centromeric breakpoints, and some balanced breakpoints.
Balanced breakpoints are invisible to array-CGH and not all were sampled by the paired-end sequencing or fine-mapped in our previous array painting.

Paired-end sequencing has various limitations, and combining it with other structural data as we have done is clearly valuable. Firstly, the method is not expected to find all rearrangements, because it samples the genome at random, and coverage is dependent on GC content [29]. Also, reads in repeats and segmental duplications generally cannot be used because they cannot be mapped to a unique match in the reference genome. Secondly, artefactual rearrangements can be created by coligation of DNA fragments during preparation for sequencing, and by errors in mapping reads.

Sampling of junctions was surprisingly good: we accounted for 97% of the copy number steps detected by array-CGH in the amplicon, where the greater number of reads across the junctions increased sensitivity. This suggests that, even using only 36 bp reads, rather few junctions would be undetectable because they are flanked by non-unique sequences. The lower sampling of single-copy junctions resulted in about 55% of the junctions detected by array-CGH being detected by sequencing. Conversely, we identified almost twice as many junctions in the amplicon as we expected from the copy number steps. These were presumably a mixture of artefacts and additional rearrangements that are not resolved by CGH, either because they involve small fragments or are balanced.

Another limitation of paired-end sequencing is that it does not show how junctions are joined together, e.g. whether two apparently-neighbouring junctions are on the same chromosome or not, nor whether the region between is interrupted by further junctions [35].
This is illustrated by two of the fusion genes, APPBP2-PHF20L1 and PLEC1-ENPP2, both transcribed across more than one genomic junction.

ZR-75-30 expresses at least 12 fusion transcripts

By combining molecular cytogenetic approaches (high-resolution array-CGH and array painting) with paired-end sequencing, we have catalogued genome rearrangements of this cell line and found 9 expressed fusion transcripts. We combined this with 3 additional fusion transcripts found by sequencing cDNA [15], for which we have identified the genomic junctions.

Nine of 12 fusions in ZR-75-30 are in the complex coamplification of chromosomes 8 and 17, the fusions APPBP2-PHF20L1, BCAS3-HOXB9, TAOK1-PCGF2 and DDX5-DEPDC6/DEPTOR being most amplified. Such complex coamplifications are common [19] and probably give the ‘firestorm’ pattern of multiple small amplified fragments seen in array-CGH [22,39]. The MCF7 cell line has a similar coamplification involving chromosomes 1, 3, 17, and 20 and containing highly-amplified gene fusions [6].

Of these 12 fusion genes, seven were formed by intra-chromosomal rearrangements, confirming that more fusion genes are formed by intra-chromosomal rearrangement than by chromosome translocation [1]. This might be expected if rearrangements arise at replication bubbles [36] rather than random breakage and rejoining.

How many expressed fusion genes are there in breast cancers?

Extrapolating from our work and Robinson DR et al. [15], ZR-75-30 may have around 18 expressed fusion genes and breast cancers in general (not cell lines) may express on average around 10.

In ZR-75-30, using structural analysis, we found half of the six expressed fusions detected by Robinson DR et al. [15], while, using cDNA sequencing, they found three of the nine we detected; both figures suggest the true total might be around 18. This is consistent with recent, probably incomplete, figures from other cell lines: 20 expressed fusions have been verified in MCF7, with several more predicted computationally [6,13,15,40]; 43 have been found in BT474 and 13 in SKBR3 [13].

Breast cancers, as opposed to cell lines, appear to have almost as many fusions. Robinson DR et al. [15] identified an average of 4.2 expressed fusions per case (0 to 20 in 38 breast tumours), compared to 5.5 per case in cell lines. Their sensitivity seems to have been around 40%, comparing their findings with ours and with the published cell line data above. This gives a best guess that breast tumours will on average express 10 fusions [41], with wide variation from case to case, as expected from their variable levels of rearrangement [42].

Are these passenger or driver mutations?

The fusions found here argue strongly that some at least are selected, i.e. ‘driver’ mutations, rather than random incidental ‘passenger’ mutations [43]. As detailed in the supplementary discussion in Additional file 9, several of the genes involved have already been found to be fused in other breast cancer cell lines (PHF20L1 and BCAS3 [6,13,15,21,44]) or in other tumours (BCAS3 again, and PCGF2, TAOK1 and TRPS1 [45,46]). Others are members of families that include multiple fused genes: the collagens, HOX and PHF families. Several of the fusions resemble known recurrent gene fusions in general functional terms [1,2]: for example, fusions of HOXB9, PCGF2, PHF20L1, and NRIP1 would be typical of the many known fusions that control gene expression directly or via chromatin structure, and all could encode functional domains of the proteins.
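
The estimates given under "How many expressed fusion genes are there in breast cancers?" can be checked with a short calculation. The sketch below is one possible reading of that arithmetic, assuming it amounts to a two-sample capture-recapture (Lincoln-Petersen) estimate for ZR-75-30 and a simple sensitivity correction for tumours. The function and variable names are illustrative and do not come from the original analysis; only the counts (9, 6, 3), the mean of 4.2 and the approximate 40% sensitivity are taken from the text.

```python
from fractions import Fraction


def lincoln_petersen(n_first: int, n_second: int, n_both: int) -> Fraction:
    """Two-sample capture-recapture estimate of a total population size.

    n_first  -- items found by the first method
    n_second -- items found by the second method
    n_both   -- items found by both methods
    """
    if n_both <= 0:
        raise ValueError("the estimate needs at least one item found by both methods")
    return Fraction(n_first * n_second, n_both)


# Counts quoted in the text for ZR-75-30.
structural_fusions = 9   # expressed fusions found by structural analysis
cdna_fusions = 6         # expressed fusions found by cDNA sequencing [15]
shared_fusions = 3       # fusions found by both approaches

zr7530_total = lincoln_petersen(structural_fusions, cdna_fusions, shared_fusions)

# Each method's sensitivity, judged against the other method's findings.
structural_sensitivity = Fraction(shared_fusions, cdna_fusions)   # "half of the six"
cdna_sensitivity = Fraction(shared_fusions, structural_fusions)   # "three of the nine"

# Tumours: observed mean per case, corrected by the approximate sensitivity.
observed_mean = Fraction("4.2")
assumed_sensitivity = Fraction("0.40")  # "around 40%" in the text; an approximation
tumour_mean = observed_mean / assumed_sensitivity

print("Estimated expressed fusions in ZR-75-30:", zr7530_total)
print("Estimated mean expressed fusions per tumour:", float(tumour_mean))
```

Under these assumptions the cell-line estimate is 18 and the tumour estimate is 10.5, in line with the "around 18" and "around 10" quoted in the text. The estimator treats the two methods as independent samples of the same set of fusions, which is a simplification: both methods are likely to favour highly amplified or highly expressed fusions.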
Several of the genes involved are also in signalling pathways relevant to breast cancer: ERBB2, NRIP1 and BCAS3 are involved in estrogen receptor function and APPBP2 with androgen receptor; while TAOK1 and SKAP1 are involved in MAPK signalling and DEPDC6/DEPTOR regulates mTOR signalling.

Several of the fused genes are also recurrently broken in a substantial proportion of breast cancers, as judged by copy number steps in array-CGH of 1000 breast tumours [47]: around 10% have breaks in ERBB2, BCAS3 and SKAP1, while COL14A1, TIAM1, USP32, TAOK1 are broken in around 4%.

Some of the fusions, and particularly those not expressed, may simply inactivate a copy of the participating gene(s) [1,6]. For example, our fusions of TIAM1 and TAOK1 inactivate one copy of these genes. Some genes, e.g. BCAS3, that are fused in more than one cancer cell line retain different, non-overlapping parts of the gene in different cases, suggesting the common theme is inactivation. In some cases fusion of a gene may suppress its expression, perhaps by destabilising the mRNA: among the predicted fusion genes for which we could not detect a transcript, unfused copies of some of the 5’ participating genes were transcribed, for example SSH2, NUDCD1 and TRAPPC9 (Table 1; Additional file 7).

Conclusion

Fusion genes in ZR-75-30 and cancers in general

We have brought the total of fusion genes expressed by ZR-75-30 to 12, and there are good reasons to think the final total will be around 18. We have argued from this and other data that carcinomas not only have fusion genes analogous to those found in leukaemias [1,4], but each case may have many of them, and many will be functionally significant. This suggests a picture of neoplasia in which all neoplasms have a mixture of mutation types: point mutations, deletions, fusion genes, etc.
Rather than leukaemias being driven by fusion genes while carcinomas were driven by point mutations and deletions, the main difference between carcinomas and leukaemias may simply be that carcinomas have more mutations than leukaemias.

Additional files

Additional file 1: Junction and fusion transcript sequences.
Additional file 2: Confirmed structural variants in ZR-75-30.
Additional file 3: Genomic junction sequences.
Additional file 4: Primers for amplifying genomic or transcript junctions and full-length fusion genes.
Additional file 5: A comparison of breakpoints determined by snp6 and solexa sequencing.
Additional file 6: A comparison of breakpoints by 1Mb array painting and solexa sequencing data.
Additional file 7: Structural rearrangements determined by paired-end sequencing.
Additional file 8: One possible assembly of ten junctions in the 8;17 amplicon of ZR-75-30.
Additional file 9: Supplementary discussion: Discussion of individual fusion genes.

Abbreviations

APPBP2: Amyloid beta precursor protein (cytoplasmic tail) binding protein 2; PHF20L1: PHD finger protein 20-like 1; BCAS3: Breast carcinoma amplified sequence 3; HOXB9: Homeobox B9; COL14A1: Collagen, type XIV, alpha 1; SKAP1: src kinase associated phosphoprotein 1; TAOK1: TAO kinase 1; PCGF2: Polycomb group ring finger 2; TIAM1: T-cell lymphoma invasion and metastasis 1; NRIP1 (RIP140): Nuclear receptor interacting protein 1; TIMM23: Translocase of inner mitochondrial membrane 23 homolog (yeast); ARHGAP32: Rho GTPase activating protein 32; TRPS1: Trichorhinophalangeal syndrome I; LASP1: LIM and SH3 protein 1; USP32: Ubiquitin specific peptidase 32; CCDC49: (CWC25) spliceosome-associated protein homolog (S. cerevisiae); ZMYM4: Zinc finger MYM-type protein 4; OPRD1: Opioid receptor, delta 1; ERBB2: v-erb-b2 erythroblastic leukemia viral oncogene homolog 2; DDX5: DEAD (Asp-Glu-Ala-Asp) box polypeptide 5; DEPDC6/DEPTOR: DEP domain containing MTOR-interacting protein; PLEC1: Plectin; ENPP2: Ectonucleotide pyrophosphatase/phosphodiesterase 2; TMPRSS2: Transmembrane protease, serine 2; ERG: v-ets erythroblastosis virus E26 oncogene homolog (avian); ALK: Anaplastic lymphoma receptor tyrosine kinase; EML4: Echinoderm microtubule associated protein like 4; ER+: Estrogen-receptor positive; Array-CGH: Array-comparative genomic hybridization; MAPK: Mitogen-activated protein kinase; SSH2: Slingshot homolog 2; NUDCD1: NudC domain containing 1; TRAPPC9: Trafficking protein particle complex 9.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

IS, KDH, CC and PAWE conceived the study. IS, KDH, JCMP, KAB, SM and SFC carried out experiments. EMB and PAWE, with KDH, SLC, CN, KH and JDB analysed the sequencing data. IS, KDH and PAWE wrote the manuscript. All authors read and approved the final manuscript.

Acknowledgements

We thank members of the Edwards lab (Scott Newman, Katherine Bird, Susanne Flach, Claire Pike and Jamie Weaver) for help with techniques, and the core Bioinformatics and Genomics services of the Cancer Research UK Cambridge Research Institute for sequencing and data processing. We thank Professor Reis-Filho and Paul Wilkerson for RNA. This work was supported by a Deutscher Akademischer Austausch Dienst fellowship to I.S., Breast Cancer Campaign and Cancer Research UK.

Author details

1 Hutchison/MRC Research Centre and Department of Pathology, University of Cambridge, Cambridge, UK.
2 Cancer Research UK Cambridge Research Institute and Department of Oncology, University of Cambridge, Li Ka-Shing Centre, Cambridge, UK.
3 Current addresses: Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK.
4 Current addresses: BlueGnome Ltd, CPC4, Capital Park, Fulbourn, Cambridge CB21 5XE, UK.
5 Current addresses: Department of Medical Genetics, University of British Columbia, Vancouver, BC V6H 2N1, Canada.
6 Current addresses: Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford OX3 7DQ, UK.
7 Current addresses: Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, UK.
8 Current addresses: Breakthrough Breast Cancer Research Centre, Institute of Cancer Research, 237 Fulham Road, London SW3 6JB, UK.
9 Current addresses: European Bioinformatics Institute, Hinxton, Cambridgeshire CB10 1SD, UK.

Received: 27 November 2012 Accepted: 14 December 2012 Published: 22 December 2012

References

1. Edwards PAW: Fusion genes and chromosome translocations in the common epithelial cancers. J Pathol 2010, 220:244–254.
2. Mitelman F, Johansson B, Mertens F: The impact of translocations and gene fusions on cancer causation. Nat Rev Cancer 2007, 7:233–245.
3. Mehra R, Tomlins SA, Shen R, Nadeem O, Wang L, Wei JT, Pienta KJ, Ghosh D, Rubin MA, Chinnaiyan AM, Shah RB: Comprehensive assessment of TMPRSS2 and ETS family gene aberrations in clinically localized prostate cancer. Modern pathology: an official journal of the United States and Canadian Academy of Pathology, Inc 2007, 20:538–544.
4. Tomlins SA, Rhodes DR, Perner S, Dhanasekaran SM, Mehra R, Sun X-W, Varambally S, Cao X, Tchinda J, Kuefer R, et al: Recurrent fusion of TMPRSS2 and ETS transcription factor genes in prostate cancer. Science (New York, NY) 2005, 310:644–648.
5. Soda M, Choi YL, Enomoto M, Takada S, Yamashita Y, Ishikawa S, Fujiwara S-i, Watanabe H, Kurashina K, Hatanaka H, et al: Identification of the transforming EML4-ALK fusion gene in non-small-cell lung cancer. Nature 2007, 448:561–566.
6. Hampton OA, Den Hollander P, Miller CA, Delgado DA, Li J, Coarfa C, Harris RA, Richards S, Scherer SE, Muzny DM, et al: A sequence-level map of chromosomal breakpoints in the MCF-7 breast cancer cell line yields insights into the evolution of a cancer genome. Genome Res 2009, 19:167–177.
7. Fiegler H, Gribble SM, Burford DC, Carr P, Prigmore E, Porter KM, Clegg S, Crolla JA, Dennis NR, Jacobs P, Carter NP: Array painting: a method for the rapid analysis of aberrant chromosomes using DNA microarrays. J Med Genet 2003, 40:664–670.
8. Howarth KD, Blood KA, Ng BL, Beavis JC, Chua Y, Cooke SL, Raby S, Ichimura K, Collins VP, Carter NP, Edwards PAW: Array painting reveals a high frequency of balanced translocations in breast cancer cell lines that break in cancer-relevant genes. Oncogene 2008, 27:3345–3359.
9. Veltman JA, Fridlyand J, Pejavar S, Olshen AB, Korkola JE, DeVries S, Carroll P, Kuo W-L, Pinkel D, Albertson D, et al: Array-based comparative genomic hybridization for genome-wide screening of DNA copy number in bladder tumors. Cancer Res 2003, 63:2872–2880.
10. Campbell PJ, Stephens PJ, Pleasance ED, O'Meara S, Li H, Santarius T, Stebbings LA, Leroy C, Edkins S, Hardy C, et al: Identification of somatically acquired rearrangements in cancer using genome-wide massively parallel paired-end sequencing. Nat Genet 2008, 40:722–729.
11. Korbel JO, Urban AE, Affourtit JP, Godwin B, Grubert F, Simons JF, Kim PM, Palejev D, Carriero NJ, Du L, et al: Paired-end mapping reveals extensive structural variation in the human genome. Science (New York, NY) 2007, 318:420–426.
12. Stephens PJ, McBride DJ, Lin M-L, Varela I, Pleasance ED, Simpson JT, Stebbings LA, Leroy C, Edkins S, Mudie LJ, et al: Complex landscapes of somatic rearrangement in human breast cancer genomes. Nature 2009, 462:1005–1010.
13. Kim D, Salzberg SL: TopHat-Fusion: an algorithm for discovery of novel fusion transcripts. Genome Biol 2011, 12:R72.
14. McPherson A, Hormozdiari F, Zayed A, Giuliany R, Ha G, Sun MGF, Griffith M, Heravi Moussavi A, Senz J, Melnyk N, et al: DeFuse: an algorithm for gene fusion discovery in tumor RNA-Seq data. PLoS Comput Biol 2011, 7:e1001138.
15. Robinson DR, Kalyana-Sundaram S, Wu Y-M, Shankar S, Cao X, Ateeq B, Asangani IA, Iyer M, Maher CA, Grasso CS, et al: Functionally recurrent rearrangements of the MAST kinase and Notch gene families in breast cancer. Nature medicine 2011, 17:1646–1651.
16. Engel LW, Young NA, Tralka TS, Lippman ME, O'Brien SJ, Joyce MJ: Establishment and characterization of three new continuous cell lines derived from human breast carcinomas. Cancer research 1978, 38:3352–3364.
17. Davidson JM, Gorringe KL, Chin SF, Orsetti B, Besret C, Courtay-Cahen C, Roberts I, Theillet C, Caldas C, Edwards PA: Molecular cytogenetic analysis of breast cancer cell lines. British journal of cancer 2000, 83:1309–1317.
18. Guan X-Y, Meltzer PS, Dalton WS, Trent JM: Identification of cryptic sites of DNA sequence amplification in human breast cancer by chromosome microdissection. Nature genetics 1994, 8:155–161.
19. Paterson AL, Pole JCM, Blood KA, Garcia MJ, Cooke SL, Teschendorff AE, Wang Y, Chin S-F, Ylstra B, Caldas C, Edwards PAW: Co-amplification of 8p12 and 11q13 in breast cancers is not the result of a single genomic event. Genes, chromosomes & cancer 2007, 46:427–439.
20. Volik S, Zhao S, Chin K, Brebner JH, Herndon DR, Tao Q, Kowbel D, Huang G, Lapuk A, Kuo W-L, et al: End-sequence profiling: sequence-based analysis of aberrant genomes. Proceedings of the National Academy of Sciences of the United States of America 2003, 100:7696–7701.
21. Bärlund M, Monni O, Weaver JD, Kauraniemi P, Sauter G, Heiskanen M, Kallioniemi O-P, Kallioniemi A: Cloning of BCAS3 (17q23) and BCAS4 (20q13) genes that undergo amplification, overexpression, and fusion in breast cancer. Genes, chromosomes & cancer 2002, 35:311–317.
22. Russnes HG, Vollan HK, Lingjaerde OC, Krasnitz A, Lundin P, Naume B, Sorlie T, Borgen E, Rye IH, Langerod A, et al: Genomic architecture characterizes tumor progression paths and fate in breast cancer patients. Sci Transl Med 2010, 2:38ra47.
23. Stamps AC, Davies SC, Burman J, O'Hare MJ: Analysis of proviral integration in human mammary epithelial cell lines immortalized by retroviral infection with a temperature-sensitive SV40 T-antigen construct. International journal of cancer Journal international du cancer 1994, 57:865–874.
24. Briand P, Petersen OW, van Deurs B: A new diploid nontumorigenic human breast epithelial cell line isolated and propagated in chemically defined medium. In vitro cellular & developmental biology: journal of the Tissue Culture Association 1987, 23:181–188.
25. Pole JCM, Courtay-Cahen C, Garcia MJ, Blood KA, Cooke SL, Alsop AE, Tse DML, Caldas C, Edwards PAW: High-resolution analysis of chromosome rearrangements on 8p in breast, colon and pancreatic cancer reveals a complex pattern of loss, gain and translocation. Oncogene 2006, 25:5693–5706.
26. Chua YL, Ito Y, Pole JC, Newman S, Chin SF, Stein RC, Ellis IO, Caldas C, O'Hare MJ, Murrell A, Edwards PA: The NRG1 gene is frequently silenced by methylation in breast cancers and is a strong candidate for the 8p tumour suppressor gene. Oncogene 2009, 28:4041–4052.
27. Bignell GR, Greenman CD, Davies H, Butler AP, Edkins S, Andrews JM, Buck G, Chen L, Beare D, Latimer C, et al: Signatures of mutation and selection in the cancer genome. Nature 2010, 463:893–898.
28. Greenman CD, Bignell G, Butler A, Edkins S, Hinton J, Beare D, Swamy S, Santarius T, Chen L, Widaa S, et al: PICNIC: an algorithm to predict absolute allelic copy number variation with microarray cancer data. Biostatistics (Oxford, England) 2010, 11:164–175.
29. Quail MA, Swerdlow H, Turner DJ: Improved protocols for the illumina genome analyzer sequencing system. In Current protocols in human genetics. Edited by Haines JL. US: Wiley; 2009. Chapter 18:Unit 18.12.
30. Li H, Durbin R: Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics (Oxford, England) 2010, 26:589–595.
31. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup: The Sequence Alignment/Map format and SAMtools. Bioinformatics (Oxford, England) 2009, 25:2078–2079.
32. Kent WJ: BLAT–the BLAST-like alignment tool. Genome Res 2002, 12:656–664.
33. Conrad DF, Bird C, Blackburne B, Lindsay S, Mamanova L, Lee C, Turner DJ, Hurles ME: Mutation spectrum revealed by breakpoint sequencing of human germline CNVs. Nat Genet 2010, 42:385–391.
34. Greenman CD, Pleasance ED, Newman S, Yang F, Fu B, Nik-Zainal S, Jones D, Lau KW, Carter N, Edwards PAW, et al: Estimation of rearrangement phylogeny for cancer genomes. Genome Res 2012, 22:346–361.
35. Pole JCM, McCaughan F, Newman S, Howarth KD, Dear PH, Edwards PAW: Single-molecule analysis of genome rearrangements in cancer. Nucleic Acids Res 2011, 39:e85.
36. Howarth KD, Pole JC, Beavis JC, Batty EM, Newman S, Bignell GR, Edwards PA: Large duplications at reciprocal translocation breakpoints that might be the counterpart of large deletions and could arise from stalled replication bubbles. Genome Research 2011, 21(40):524–534.
37. Bignell GR, Santarius T, Pole JCM, Butler AP, Perry J, Pleasance E, Greenman C, Menzies A, Taylor S, Edkins S, et al: Architectures of somatic genomic rearrangement in human cancer amplicons at sequence-level resolution. Genome Res 2007, 17:1296–1303.
38. Hastings PJ, Ira G, Lupski JR: A microhomology-mediated break-induced replication model for the origin of human copy number variation. PLoS Genet 2009, 5:e1000327.
39. Hicks J, Krasnitz A, Lakshmi B, Navin NE, Riggs M, Leibu E, Esposito D, Alexander J, Troge J, Grubor V, et al: Novel patterns of genome rearrangement and their association with survival in breast cancer. Genome Res 2006, 16:1465–1479.
40. Hampton OA, Koriabine M, Miller CA, Coarfa C, Li J, Den Hollander P, Schoenherr C, Carbone L, Nefedov M, Ten Hallers BF, et al: Long-range massively parallel mate pair sequencing detects distinct mutations and similar patterns of structural mutability in two breast cancer cell lines. Cancer Genet 2011, 204:447–457.
41. Edwards PA, Howarth KD: Are breast cancers driven by fusion genes? Breast cancer research: BCR 2012, 14:303.
42. Fridlyand J, Snijders AM, Ylstra B, Li H, Olshen A, Segraves R, Dairkee S, Tokuyasu T, Ljung BM, Jain AN, et al: Breast tumor copy number aberration phenotypes and genomic instability. BMC Cancer 2006, 6:96.
43. Stratton MR, Campbell PJ, Futreal PA: The cancer genome. Nature 2009, 458:719–724.
44. Zhao Q, Caballero OL, Levy S, Stevenson BJ, Iseli C, de Souza SJ, Galante PA, Busam D, Leversha MA, Chadalavada K, et al: Transcriptome-guided characterization of genomic rearrangements in a breast cancer cell line. Proc Natl Acad Sci USA 2009, 106:1886–1891.
45. Banerji S, Cibulskis K, Rangel-Escareno C, Brown KK, Carter SL, Frederick AM, Lawrence MS, Sivachenko AY, Sougnez C, Zou L, et al: Sequence analysis of mutations and translocations across breast cancer subtypes. Nature 2012, 486:405–409.
46. Ellis MJ, Ding L, Shen D, Luo J, Suman VJ, Wallis JW, Van Tine BA, Hoog J, Goiffon RJ, Goldstein TC, et al: Whole-genome analysis informs breast cancer response to aromatase inhibition. Nature 2012, 486:353–360.
47. Curtis C, Shah SP, Chin SF, Turashvili G, Rueda OM, Dunning MJ, Speed D, Lynch AG, Samarajiwa S, Yuan Y, et al: The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012, 486:346–352.

doi:10.1186/1471-2164-13-719
Cite this article as: Schulte et al.: Structural analysis of the genome of breast cancer cell line ZR-75-30 identifies twelve expressed fusion genes. BMC Genomics 2012 13:719.

