UBC Faculty Research and Publications

Increasing quality, throughput and speed of sample preparation for strand-specific messenger RNA sequencing Haile, Simon; Corbett, Richard D; MacLeod, Tina; Bilobram, Steve; Smailus, Duane; Tsao, Philip; Kirk, Heather; McDonald, Helen; Pandoh, Pawan; Bala, Miruna; Hirst, Martin; Miller, Diane; Moore, Richard A; Mungall, Andrew J; Schein, Jacquie; Coope, Robin J; Ma, Yussanne; Zhao, Yongjun; Holt, Rob A; Jones, Steven J; Marra, Marco A Jul 5, 2017

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata


52383-12864_2017_Article_3900.pdf [ 5.33MB ]
JSON: 52383-1.0366953.json
JSON-LD: 52383-1.0366953-ld.json
RDF/XML (Pretty): 52383-1.0366953-rdf.xml
RDF/JSON: 52383-1.0366953-rdf.json
Turtle: 52383-1.0366953-turtle.txt
N-Triples: 52383-1.0366953-rdf-ntriples.txt
Original Record: 52383-1.0366953-source.json
Full Text

Full Text

METHODOLOGY ARTICLE Open AccessIncreasing quality, throughput and speedof sample preparation for strand-specificmessenger RNA sequencingSimon Haile1, Richard D. Corbett1, Tina MacLeod1, Steve Bilobram1, Duane Smailus1, Philip Tsao1, Heather Kirk1,Helen McDonald1, Pawan Pandoh1, Miruna Bala1, Martin Hirst1,2, Diane Miller1, Richard A. Moore1,Andrew J. Mungall1, Jacquie Schein1, Robin J. Coope1, Yussanne Ma1, Yongjun Zhao1, Rob A. Holt1,Steven J. Jones1 and Marco A. Marra1,3*AbstractBackground: RNA-Sequencing (RNA-seq) is now commonly used to reveal quantitative spatiotemporal snapshotsof the transcriptome, the structures of transcripts (splice variants and fusions) and landscapes of expressedmutations. However, standard approaches for library construction typically require relatively high amounts of inputRNA, are labor intensive, and are time consuming.Methods: Here, we report the outcome of a systematic effort to optimize and streamline steps in strand-specificRNA-seq library construction.Results: This work has resulted in the identification of an optimized messenger RNA isolation protocol, a potentreverse transcriptase for cDNA synthesis, and an efficient chemistry and a simplified formulation of libraryconstruction reagents. We also present an optimization of bead-based purification and size selection designed tomaximize the recovery of cDNA fragments.Conclusions: These developments have allowed us to assemble a rapid high throughput pipeline that produceshigh quality data from amounts of total RNA as low as 25 ng. While the focus of this study is on RNA-seq samplepreparation, some of these developments are also relevant to other next-generation sequencing library types.Keywords: Ampure XP magnetic beads, Next-generation sequencing, Library construction, Strand-specific, dUTP,Uracil DNA N-Glycosylase, Ligation, mRNA, RNA-seq, Illumina, Ligation, Reverse transcriptaseBackgroundRevolutionary developments in Next-Generation Sequen-cing (NGS) technologies and bioinformatics have enabledthe pursuit of large scale genomics and functional genom-ics studies at increasingly reasonable cost and speed. Oneof the key factors in the generation of high quality NGSdata is the process of library construction.The main purpose of the multi-step library constructionprocess is to provide priming sites and sample-specificindexing barcodes for platform-specific sequencing reac-tions that are part of various NGS technologies. Typically,this requires ligation of DNA fragments to adapters thatcontain sequencing primers using T4 DNA ligase. Thisligase works optimally on double-stranded substrates [1]. Inthe case of RNA-seq, this means that double-strandedcDNA needs to be generated a priori. However, a standarddouble-strand cDNA synthesis would result in strand-agnostic RNA-seq data, which has been shown to have in-ferior accuracy of transcript quantification and annotation,and is devoid of the capacity to discern anti-sense RNAbiology compared to strand-specific RNA-seq (ssRNA-seq)[2–4]. To maintain strand-specific information, a widelyemployed ssRNA-seq protocol involves incorporation ofdUTP during second strand cDNA synthesis [5, 6]. Once* Correspondence: mmarra@bcgsc.ca1Canada’s Michael Smith Genome Sciences Centre, BC Cancer Agency, 675West 10th Avenue, Vancouver, BC V5Z 1L3, Canada3Department of Medical Genetics, University of British Columbia, Vancouver,BC V6T 1Z3, CanadaFull list of author information is available at the end of the article© The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, andreproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link tothe Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.Haile et al. BMC Genomics  (2017) 18:515 DOI 10.1186/s12864-017-3900-6the double-stranded cDNA fragments are ligated toadapters, the dUTP-marked strand is selectively destroyedby the enzyme Uracil DNA N-Glycosylase (UNG).Library construction involves a series of enzymaticreactions, each typically followed by a purification step.A typical library sample preparation for RNA-seq, forexample, involves 6–8 such purification steps. Tradition-ally, these purification steps involved laborious processessuch as phenol-chloroform or column-based purifica-tions. The replacement of these purifications withparamagnetic bead-based solid phase reversibleimmobilization (SPRI) is a significant advance render-ing library construction simpler and more amenable tohigh throughput or robotic handling [7, 8]. The mostwidely applied and commercialized form of SPRI magneticbeads are Ampure XP beads.The SPRI technology is based on the reversible bind-ing of nucleic acids to carboxyl-coated paramagneticbeads in the presence of a buffer that contains high saltand polyethylene glycol (PEG). The technology is usedin the preferential purification of nucleic acids of acertain size range by adjusting the buffer composition.Reductions in salt and PEG result in selective enrichmentof larger fragments. This strategy has been exploited toreplace size selection that was traditionally performedthrough gel-based methods [9, 10].The vast majority of RNA mass for a given total RNAsample is comprised of a very few species of ribosomalRNAs (rRNAs) [11, 12]. Thus, removal of rRNAs is a com-mon first step in the sample preparation for RNA-seq.Since most rRNAs are not polyadenylated [13], processingof eukaryotic samples involves a positive selection of polya-denylated mRNAs by using the 3′-poly (A) tail as bait [14].Another approach is a negative selection where rRNAs aredepleted directly by using specific probes thereby main-taining informative, non-polyadenylated, non-ribosomalRNAs resulting in a more complete representation of thetranscriptome [15–17].In this study, gel and bead-based size selections arecompared and the optimal condition for maximum recov-ery in bead-based purifications is explored. In addition,the differential behaviour of single versus double strandedDNA/cDNA in bead-based size selection is demonstratedand the implication of this in strand-specific RNA-seq isshown. Other aspects that are addressed in this studyinclude assessment of mRNA isolation protocols, identifi-cation of a more robust reverse transcriptase to maximizecDNA yield, evaluation of various library construction kitsand optimization of ligation.MethodsRNA and genomic DNA samplesUniversal Human Reference (UHR) total RNA waspurchased from Stratagene (Catalog #740000). RNAwas quantified using Agilent RNA 6000 Nano Kit(Catalog #5067–1511). For some experiments, UHRwas spiked with External RNA Controls Consortium(ERCC) spike-in mix from Ambion (Catalog# 4456740).0.02 μL of the spike-in mix was used per 1 μg UHRtotal RNA.Genomic DNA (gDNA) was prepared from Humanpromyelocytic Leukemia cell line (HL60) using Phenol/Chloroform/Isoamyl Alcohol (25:24:1) phase extraction.gDNA from the tumor samples was prepared usingQiagen’s AllPrep DNA/RNA Mini Kit (Catalog#80204).DNA was quantified using Qubit dsDNA HS DNA assay(Thermo Fisher; Catalog# Q32851).mRNA isolation, cDNA synthesis and library constructionThe final form of our protocols as well as materials usedafter optimizations described in this paper are detailedin Additional files 1, 2 and 3 (mRNA isolation, cDNAsynthesis, and library construction).RNA sequencing and bioinformatic analysisMost RNA-seq Libraries were sequenced using paired-end 75 base (PE75) sequencing chemistry on HiSeq 2500instruments following the manufacturer’s protocols(Illumina). In some cases, RNA-seq libraries weresequenced with PE100/125 base reads but were subse-quently trimmed to 75 bases prior to alignment-basedanalysis. gDNA libraries were sequenced at PE125.RNA libraries were aligned against the hg19 referencein concert with Ensembl 61 gene annotations usingJAGuaR (PMID 25062255). Resulting BAM files werecoordinate-sorted with SAMtools (PMID 19505943) andduplicate reads were marked with Picard (http://broad-institute.github.io/picard/).To enable comparisons of metrics that were sensitiveto the number of reads used (gene detection and dupli-cate rates, etc) we randomly down-sampled each of thebams in our comparisons to a consistent number ofreads using Picard’s DownsampleSam command. Thedown-sampled bams were then duplicate read markedwith Picard and put through our in-house gene profilingsoftware to estimate the expression of each gene, intron/exon ratios, and transcript coverage bias. In some cases,the down-sampling was done multiple times to ensurethe stability of the results to random read sampling.Base error rates were estimated by counting mis-matches in the aligned reads. Using this approach, realSingle Nucleotide Variants (SNVs), including Single Nu-cleotide Polymorphisms (SNPs), would be incorrectlyidentified as base errors. However, the base error rate(0.002–0.004) was far higher than the SNP frequencyrate (0.001) and we are only interested in relative baseerror frequencies between samples.Haile et al. BMC Genomics  (2017) 18:515 Page 2 of 14Insert size distributions were estimated from readalignments to the mitochondrial genome. This approachprovided deep read coverage, avoiding the confoundingeffects of splicing.To measure error rates and/or evaluate the quantitativerange of transcript levels, we performed ERCC spike-inanalyses by re-aligning any read which either unaligneditself, or has an unaligned mate against the ERCC refer-ence. The re-alignments are performed using BWA mem(http://arxiv.org/abs/1303.3997) and requiring a highlyspecific match (−K 40).Strand-specificity was estimated by calculating the frac-tion of Ensembl 61 exon-exon junctions with 0 readsmapping to the anti-sense strand. We focus on readsspanning exon-exon junctions to avoid confounding readsthat may originate from genomic DNA background.Results and discussionThe focus of this study is the optimization and stream-lining of sample preparation for RNA-seq. Our workflowbefore and after these changes is depicted in Fig. 1. Inthe following sections, we describe the major aspects ofsuch optimization measures implemented to improvelibrary quality, automate steps in sample preparation,and streamline procedures to shorten the turnaroundtime at our sequencing facility. Unless otherwise noted,the experiments discussed in this study use UniversalHuman Reference (UHR) total RNA (Stratagene) asinput. UHR RNA is a mixture of RNA prepared from 10different cancer cell lines to represent the majority ofRNA species expressed in various tumors.Of note, the order in which data is discussed below isdifferent from what is presented in the workflowdepicted in Fig. 1 to enable accurate assessment of data.Improving cDNA yieldWe have generated thousands of RNA-seq libraries usingthe widely employed reverse transcriptase Superscript II(SSII). We noted a decline in the performance of theenzyme overtime, and observed an inverse correlation be-tween the amount of SSII and cDNA yield with increasedyield resulting in better library quality (Additional file 4:Figure S1). This occurred despite the fact that all enzymeamounts used were at <10% of reaction volume,Fig. 1 Workflow of ssRNA-seq pipeline at our facility. On the left is the previous version of our pipeline and the on the right is the new version.Red font denotes steps which are removed in the new version and blue font represents process or reagent modificationsHaile et al. BMC Genomics  (2017) 18:515 Page 3 of 14which as a general rule, is the maximum propositiontypically used to avoid the inhibitory effect of glycerolin the enzyme storage buffer. We adopted 0.5 μL ofenzyme per 25 μL reaction for routine constructionof ssRNA–seq libraries and we used this amount insubsequent experiments in this study.The issues observed with the SSII prompted us tolook for alternative reverse transcriptases (RTs).Screening of seven RTs from various vendors (datanot shown) led to the identification of the Maxima HMinus (Maxima) from Thermo Fisher Scientific as themost potent reverse transcriptase in our reaction con-ditions. Figure 2a shows that the Maxima enzymeproduces higher cDNA yield relative to SSII. Further,the percentage of reads aligned for libraries producedusing Maxima was higher, or comparable, to thosegenerated by SSII (Additional file 5: Table S1). Allmajor sequence quality metrics indicated that the useof this RT allows the production of higher qualitydata (Fig. 2b-c; Additional file 5: Table S1). DuplicationFig. 2 The Maxima H Minus reverse transcriptase provides higher yield of cDNA and quality of libraries. a cDNA yield assessment. X-axis indicatesvarious UHR RNA input amounts used for mRNA isolation and cDNA synthesis. Double strand cDNA was measured using the Qubit HS DNA assay.Values from this assay were normalized relative to the value obtained when using Superscript II (RT-II) for the 250 ng input. b Diversity of libraries.Libraries were generated from cDNA samples that were prepared using the best performing RT (Maxima) and Superscript II (SS-II). The resultingsequencing data were analyzed for duplicate rates. c ERCC spike-in sequence differences. Mismatch rates were calculated by comparing observedsequences and expected sequences from the known spike-in synthetic RNAs. X-axis represents various UHR RNA input amounts used for mRNAisolation and cDNA synthesis. Y-axis is error rate per 1000 nucleotides. n = 3; error bars = Standard Deviation. *P < 0.05. P values were calculatedusing Student’s t-test (unpaired and equal variance)Haile et al. BMC Genomics  (2017) 18:515 Page 4 of 14rate, for example, was reduced ~1.8 fold (Fig. 2b); suggest-ing increased transcript diversity. This trend applied to allinput amounts tested between 25 ng and 1000 ng totalRNA input.To assess RT-associated base error rate, we tookadvantage of the ERCC spike-in RNA mix that wasinitially developed by the External RNA ControlsConsortium [18, 19]. This mix is comprised of 92 indi-vidual synthetic RNAs of known sequence and concen-tration aimed at representing eukaryotic transcripts ofvaried lengths. The spike-in RNAs are added to theUHR RNA prior to the mRNA isolation step. Sequencedivergence of ERCC transcripts observed in our datacould, therefore, be attributed to an aggregate intrinsicerror rate of the enzymes involved in cDNA synthesis,library construction and sequencing. By making the RTenzyme the only variable, we could assess the fidelity ofRT enzymes. We also estimated error rate by looking atthe UHR whole transcriptome data where divergence ofsequence was measured relative to a compendium of hu-man reference sequences comprised of known single nu-cleotide polymorphisms (SNPs). Both assessmentssuggested the degree of fidelity between Superscript IIand Maxima H Minus is not significantly different(Fig. 2c; Additional file 6: Figure S2).Gel versus bead-based size selectionAdapter dimers and unwanted smaller fragments aretypically avoided using gel-based size selection, which isa laborious process. We compared bead-based size selec-tion to gel-based size selection in the context of ourssRNA-seq pipeline. The ratio of beads to DNA in thebead-based size selection was 1:1 (on volume to volumebasis) and the purification was performed twice. Thegel-based size selection targeted 280-400 bp of PCR-enriched library fragments. Library yield was similarbetween the two conditions (data not shown) despite thebroader profile from the bead-based condition (Fig. 3a).This suggested that there is a significant loss of DNAassociated with the bead purification, which we addressbelow. Despite these differences, all libraries exceededthe minimum sequencing requirements and were eval-uated based on alignment –based sequencing qualitymetrics (Additional file 5: Table S2). Various sequen-cing metrics indicated that library quality is compar-able between these two conditions (Additional file 5:Table S2; Fig. 3b).Optimal bead-DNA binding time is dependent on DNAamountThe manufacturer specified binding time for AmpureXP beads to DNA is 5 min in the context of the purifica-tion of PCR products with 1.8 to 1 beads to DNA ratio.We sought to address whether the optimal binding timevaries depending on DNA amount, size and the reactioncomposition. We chose to investigate this within a typ-ical process that generates PCR-enriched gDNA librarieswith a target of 200-250 bp insert size. The generationof such libraries involves 7 purification steps includingbead-based size selection. The post-ligation purificationsinvolved two rounds of purifications with 1:1 beads toDNA ratio and further served as size selection steps byremoving unwanted small gDNA fragments, excessadapters, and adapter dimers enriching for >200 bp frag-ments. Similarly, the post-PCR purifications involvedtwo rounds of purifications with a 1:1 ratio of beads toDNA. The pre-ligation steps were performed with a 2:1ratio so that maximum recovery of DNA was achieved.gDNA input amounts for library construction were 1 ng,30 ng, 100 ng and 1000 ng and bead binding time pointsevaluated were 3 min, 5 min, 10 min, and 15 min. TheFig. 3 Bead versus gel-based size selection. a Insert size. Size profileswere based on reads that mapped to the human mitochondrialgenome. b Differential gene expression between two conditions.DE-seq plots show genes (in red dots) that were differentiallyexpressed at a statistically significant level (FDR ≤ 0.1) as in [21].n = 3 (replicate libraries)Haile et al. BMC Genomics  (2017) 18:515 Page 5 of 14cumulative effect on all the purifications up to the sec-ond round of post-ligation purification (Fig. 4a) and upto the second round of post-PCR purification (Fig. 4b)was assessed. A fifteen minute binding time was foundto be the most optimal for all input amounts for boththe pre-PCR and post-PCR assessments. In the case ofthe pre-PCR assessment, the difference was statisticallysignificant (p < 0.05) but marginal for input amounts upto 100 ng in contrast to the 1000 ng input amountwhere 15 min binding gave ~2-fold higher recovery rela-tive to the 3 min binding time point. This suggests thathigher DNA amounts require a longer binding time toachieve maximum recovery. The number of PCR cycleswas adjusted considering input amounts and the PCRproducts were in the microgram range for all inputamounts. Consistent with the pre-PCR data for the1000 ng input, the post-PCR data also showed a similartrend where 15 min binding resulted in ~1.6 to 2-foldhigher recovery relative to the 3 min binding time pointfor all input amounts. The trends shown in Fig. 4 werealso observed for larger DNA fragments and they persistirrespective of reaction composition (data not shown).We also compared the starting DNA amount with theamount recovered after a single purification in the contextof the post-PCR purification and found that recovery ratewas at ~95% for the 15 min binding time (data notshown). The empirical data presented here shows thatrecovery of DNA is enhanced by longer incubation times.Bead-based purification/size selection after UNG digestionFor post-ligation purifications, two rounds of purifica-tions with 1:1 ratio are commonly applied for maximalremoval of adapter artifacts (mainly dimers). The firstpurification step is inefficient in removing the unwantedfragments as the ligation buffer contains very highamount of PEG. Additional file 7: Figure S3A shows thefinal library profiles relative to the sheared material forthe gDNA protocol. As discussed above, the strand-specific RNA sequencing protocol that is widelyemployed involves the dUTP marking and subsequentdestruction of the second strand cDNA via UNG [5, 6].To be able to have a streamlined workflow, we firstaimed at splitting the two rounds of purifications withone round being after ligation and the other after UNGdigestion. The final library profiles relative to thesheared material for ssRNA-seq are shown in Additionalfile 7: Figure S3B. Relative to the gDNA library profiles,we observed that the final libraries from the ssRNA-seqprotocol were larger in size despite the profiles of thesheared material being similar. These data suggested thatthe 1:1 purification might be resulting in removal ofrelatively larger fragments when applied after UNGdigestion. To directly demonstrate this, we looked atvarious combinations of 1:1 versus 2:1 ratios for bothpost-ligation and post-UNG purification steps. As shownin Fig. 5a, both conditions where we observed largerlibrary sizes involve 1:1 post-UNG. There was no signifi-cant difference in size between 1:1 and 2:1 post-ligationwhen post-UNG was held constant at 2:1. Importantly,Fig. 4 Bead binding time point analysis. a Pre-PCR assessment.Various gDNA input amounts (X-axis) were used and libraries weremade where the binding time for each of the bead cleanups wasvaried. The cumulative effect after all cleanups up to the point ofpost-ligation cleanups is shown. The purified ligated DNA wasmeasured using a Qubit HS assay. The values from this assay werenormalized to that of the 15 min condition. b Post-PCR assessment.As in (a) but purified DNA was measured after PCR enrichment. i.e.after additional two post-PCR bead-based purifications. n = 3; errorbars = Standard Deviation. *P < 0.05Haile et al. BMC Genomics  (2017) 18:515 Page 6 of 14libraries were larger in 1:1 post-UNG versus 2:1 post-UNG when post-ligation was held constant at 2:1. Thesedata confirm that purification with 1:1 beads to DNAratio is not optimal for purification after UNG digestion.Identification of an optimal UNG reaction protocolOne way to avoid 1:1 post-UNG purification is to per-form two rounds of purifications with 1:1 ratio post-ligation and 2:1 ratio post-UNG. However, this adds anadditional bead purification step. Alternatives that avoidthis extra step include: (a) using the UNG reaction as atemplate for PCR without purification in between; (b)including the UNG in the reaction mix for PCR and (c)omitting the UNG enzyme all together. A potentialcaveat for (a) is that impurity might compromise PCRefficiency. Additional concern for (b) relates to theupstream reaction and deactivation incubations for theUNG reaction that might compromise PCR efficiency.The theoretical expectation for (c) to work and stillachieve strand-specificity is the inability of the highfidelity PCR enzymes (Phusion in our case) to amplifydU containing template. The extent of background amp-lification might, however, be too high to achieve accept-able strand-specificity. We evaluated these conditions inparallel at the levels of library yield, library size profilesand sequencing data.As expected, all three conditions displayed similar librarysizes to those where distinct UNG reaction was applied and2:1 post-UNG purification was performed (Fig. 5b). Ofnote, the 1:1 post-UNG condition had largest library size(Fig. 5b) and lowest library yield (Fig. 5c). The next lowestyield was unexpectedly from the condition where UNGFig. 5 Post-UNG bead-based purifications: library yield data. a Effect of bead to DNA ratio on library sizes. Various combinations of 1:1 and 2:1bead to DNA ratio were applied for post-ligation and post-UNG purifications. The final purified PCR product was run on Agilent DNA 1000 forsize profiling. b Size profiles of libraries made using variations of the UNG step. The first three conditions where bead amount was varied forpost-ligation and post-UNG purifications involve a distinct UNG step. The fourth condition also has a separate UNG step but the reaction is usedas a template for PCR without purification in between. The next condition combines UNG and PCR reactions where as the last condition omitsthe UNG treatment all together. The sizes of these libraries were calculated from fragment smear analysis using Agilent’s software. Input was 1μgUHR total RNA. c Yield comparison of libraries made using various formats of the UNG step. As in (b) but endpoint data is concentration of thefinal libraries. n = 3; error bars = Standard DeviationHaile et al. BMC Genomics  (2017) 18:515 Page 7 of 14was excluded. The decrease in yield was even more severefor this condition upon further increase of the amount oftemplate for PCR (Additional file 8: Figure S4). The otherconditions had comparable yield to each other. Insert sizewas estimated bioinformatically based on mitochondrialtranscripts. Consistent with the electrofluorogram data, the1:1 post-UNG condition had the largest insert size whereasthe others were comparable to each other (Fig. 6a). Tocompare the quality of these libraries, diversity was inferredfrom duplicate rates (Fig. 6b). Again, the 1:1 post-UNGcondition stood out with highest duplicate rate and theothers were comparable to each other. A similar trend wasobserved for the percentage of reads aligning tomitochondria (Fig. 6c). All libraries had >99% strand-Fig. 6 Post-UNG bead-based purifications: Sequencing data. a Bioinformatic insert size profiles correlate with lab level data. The same librariesdescribed in 5B and 5C were sequenced. Other post-alignment assessments included % duplicates (b), % Mitochondrial (c) and strand- specificity(d). n = 3; error bars = Standard DeviationHaile et al. BMC Genomics  (2017) 18:515 Page 8 of 14specificity even though the condition where UNGwas omitted had lower strand-specificity by ~0.3%(Fig. 6d). In contrast, strand-specificity was calculatedto be 2.5% for strand-agnostic libraries (Additionalfile 9: Figure S5). These results collectively suggestthat the inclusion of UNG in PCR is the most optimalamong the conditions tested.Comparison of various library construction kitsOur current protocol for library construction includes abead purification step after each of the library constructionreactions: end-repair, A-addition and ligation (Fig. 1). Thereagents for our protocol are sourced from New EnglandBiolabs (NEB). Recently, various commercial kits haveemerged in which one or more of the bead purifications areremoved and in some cases, two reactions are coupled intothe same step. NEB has three different products represent-ing these workflows (Fig. 7a). To evaluate the trade-off be-tween library yield and streamlined workflow, theseproducts were compared using our PCR-free protocol start-ing with the same sheared and size-selected HL60 gDNA.The first product employs bead clean ups after each of thelibrary construction steps; the second removes the beadpurification after A-tailing; and the third couples end-repaira bcFig. 7 Identification of optimal library construction chemistry and ligation condition. a Workflows of three categories of library constructionchemistries. Work flow-A has cleanups after every step of library construction and in Work flow-B the cleanup after A-addition is removed whereas in the most-streamlined Work flow-C end-repair and A-addition are coupled into one reaction and the cleanup after A-addition is removed.b Comparison of library yield between the three NEB workflows. PCR-free libraries were generated using the chemistries that represented theworkflows depicted in (a) using two different amounts of gDNA as inputs. qPCR was applied to measure the final library yield. c Optimizationof ligation. For the best performing chemistry (NEB workflow-B), ligation time point analysis was performed by varying the adapter amount. Thiswas performed using our ssRNA-seq pipeline. n = 3; error bars = Standard DeviationHaile et al. BMC Genomics  (2017) 18:515 Page 9 of 14and A-tail reactions and enables ligation without a cleanupin between. The second product has the reaction compo-nents in one premix and, importantly, has a proprietaryligation enhancer component. This formulation gave thehighest yield (Fig. 7b) with comparable sequencing dataquality to the others (Additional file 5: Table S3). The yieldusing this chemistry was higher than that of others fromdifferent vendors that are representative of the threework flows (data not shown). The minimum inputamount tested for PCR-free libraries was 250 ng usingthe newly optimized protocol. The robustness of theprotocol has led to an ongoing assessment of evenlower input amounts for the PCR-free genome pipeline.Beyond HL60, this new protocol was shown to be morerobust for 3 other gDNA sources from different clinicalsamples (Additional file 10: Figure S6). While thecurrent manuscript was in revision, NEB has releasedversion 2 (NEBNext Ultra II) of their most stream-lined workflow. We have not assessed this productfully even though our preliminary data suggests thatlibrary yield obtained using this protocol is compar-able to the yield attained using the most optimalprotocol we identified in Fig. 7 (Additional file 11:Figure S7). Of note, these comparisons did notinclude the respective PCR modules of the variouslibrary construction kits and instead we applied ourcurrent PCR module using Thermo Fisher’s Phusionenzyme to all protocols.Optimization of ligation time and adapter amountVirtually all commercial kits for library construction employa short ligation time (~15 min). We have observed that evenwith some of these kits, longer ligation times improve libraryyield significantly with proportional increase in library diver-sity (data not shown). However, ligation time could be re-duced without drastic loss in library yield when a higheradapter to insert ratio is applied (Fig. 7c). Based on thesedata, we have chosen a 15 min optimized ligation with theNEB premix library construction chemistry to be our pre-ferred protocol due to the improved workflow and increasedlibrary yield. These data also suggests that ligation timeneeds to be controlled so that ‘batch effects” due to yield(and hence library diversity) are avoided.Optimal mRNA isolationThe mRNA isolation technique we have been employing isthe MultiMACS procedure (Miltenyi Biotec, Germany),which uses magnetic oligo-dT beads in a 96 mini-columnarray format [20]. We have successfully used this techniqueand over the years have further optimized and automatedthe procedure (Additional file 12: Figure S8, Additionalfile 13: Figure S9 and Additional file 14: Figure S10).This protocol gave high quality libraries for totalRNA inputs that ranged between 500 ng and5000 ng (Additional file 15: Figure S11). However,the major hurdle in lowering the required input fur-ther was high rRNA content. For input amounts between25 ng to 200 ng, rRNA content varied between 10% and60% in contrast to higher input range which ranged 0.2–10% (Additional file 5: Tables S4 and S5).To evaluate alternative mRNA isolation methods withthe aim of reducing ribosomal RNA content, we com-pared our latest MultiMACS-based protocol with NEB’smRNA module and Biooscientific’s NEXTflex mRNAisolation protocol. UHR RNA input amounts testedranged from 25 ng to 1000 ng. For all inputs tested, thetwo newer protocols gave much higher cDNA yield forthe 1000 ng input relative to the modified MultiMACSprotocol (Additional file 16: Figure S12A). Assessmentof 18S rRNA to GAPDH mRNA mean expression ratioby qPCR indicated that the two protocols had muchlower rRNA content with the NEB protocol providingthe lowest values (Additional file 16: Figure S12B). En-couraged by these data, we made libraries representingthese protocols at the various input amounts and se-quenced them in a lane of HiSeq 2000. All library qualitymetrics favored all three protocols for input amounts500 and 1000 ng (Additional file 5: Table S5). Consistentwith the qPCR data, % ribosomal RNA was <5% for bothof the new protocols with <500 ng input range whereasthe multiMACS protocol gave 48–68% ribosomal forthese input amounts (Additional file 5: Table S5). TheNEB protocol was at 0.1–0.5% rRNA for all inputamounts where as the NEXTflex protocol gave 0.2–4%rRNA for all input amounts with the higher end repre-senting lower input libraries (Additional file 5: Table S5).Consistent with the data on cDNA yield, the librariesfrom the two new protocols also gave generally higherlibrary quality in terms of transcript diversity and otherquality metrics (Additional file 5: Table S5). Since the %ribosomal was consistently lower for all input amountswith the NEB protocol, this was chosen as our finalmRNA isolation module.The new pipeline produces high quality libraries from 25to 1000 ng Total RNA inputThe aforementioned optimizations led to a new pipeline.We next assessed whether the lower input (e.g. 25 ng) li-braries are indeed qualitatively comparable to the higherinput (e.g. 1000 ng) libraries when using this pipeline. Rela-tive proportions of reads that mapped to exonic, intronicand intergenic regions were comparable between the inputamounts (Fig. 8a). All libraries displayed >95% alignmentrates (Additional file 5: Table S5), which was comparableacross the input amounts tested as low as 25 ng (Fig. 8b).All libraries had >98% strand-specificity with comparablebase error rates (Additional file 5: Table S5). 3′-end bias isone caveat that is usually associated with poly (A) capture-Haile et al. BMC Genomics  (2017) 18:515 Page 10 of 14based protocols. 5′-end to 3′-end ratios were calculated tomeasure the extent of this bias and were found to bewithin 0.87–1.1 for all the libraries (Additional file 5:Table S5). While we expect these numbers to bedifferent for RNA samples with variable quality, theseresults suggest that the lowering of the input amountper se is not going to be a significant factor. The onlymetric where we saw a significant difference is % duplicates,which was ~1.5-fold higher for 25-200 ng input amounts ascompared to 500-1000 ng input range (Fig. 8b).One major application of RNA-seq is quantification ofgene expression at the transcript level. Thus, we wantedto investigate if the low input libraries represented therelative amounts of transcripts of various expressionlevels rather accurately. To gauge this, we undertooktwo approaches. First, we calculated the correlation ofexpression between the different input amounts for theentire complement of the UHR transcriptome. Second,we evaluated how well the observed levels of ERCCspike-in synthetic RNAs correlated with the “groundtruth” expected levels. For the latter, we spiked UHRRNA with ERCC spike-in proportional to the inputamount at a constant final % value. The spiking wasdone just before mRNA isolation and thus the readouton the ERCC data is indicative of the aggregate of allprotocols applied starting with the mRNA isolationmodule. Pair-wise Pearson’s correlation values indi-cated very high correlation of expression levels be-tween the various input amounts (Additional file 17:Figure S13, upper panel). Average correlation relativeto the 1 μg input was 0.987 for the 25 ng input, 0.989for the 100 ng input, 0.989 for the 200 ng input and0.992 for the 500 ng input. ERCC spike-in correlationwas also in agreement with this UHR data withexpression correlation of observed vs expected being0.94–0.98 for each of the RNA input amounts(Additional file 17: Figure S13, lower panel). Thecorrelation values within technical replicates for eachinput amount was 0.989–0.999 for the UHR data(Additional file 17: Figure S13, upper panel), suggesting avery high reproducibility of the new pipeline.We next wanted to validate the new ssRNA-seq pipelinefor lower RNA input beyond UHR using RNA from tumorsamples. To do this, we tapped into a set of human clinicaltumour samples which were obtained as part of an ongoingpersonalized oncogenomics project at our centre [14].100 ng total RNA from twelve such samples, representing avariety of tissues, was used as a starting material. Some ofthe tissues were embedded in Optimal Cutting Temperaturecompound (OCT) while others were from fresh frozen spec-imens. The resulting libraries from each of the 12 sampleswere at an order of magnitude or higher relative to the mini-mum concentration required for standard Illumina sequen-cing (Fig. 9a). Post-alignment metrics of the sequencing dataare shown in Additional file 5: Table S6. Included in theseanalyses is also data from libraries that were previously gen-erated from the same samples from higher Total RNA input(2000 ng). The high input libraries were generated using thesame protocols with the exception that mRNA isolation wasaccording to the MultiMACS procedure described above.Additional file 5: Table S6 shows that the duplicate rate, %rRNA, and % intergenic is lower for the newer libraries.However, 3′-end bias appears to be slightly higher for thenewer libraries with the average 5′/3 value being 0.87 versus1.05. This is not due to input RNA quality differences as theRNA integrity was comparable (Additional file 18: FigureS14) even though degradation during subsequent DNasetreatment and mRNA isolation steps cannot be ruled out.Input amount difference (100 ng versus 2000 ng) may beplausible proximal factor as we have previously observedFig. 8 UHR total RNA input titration using the new ssRNA-seq pipeline. a Comparable mapping of reads to various transcriptome catagories.b Other alignment-based metrics. Y-axis represents the value for each of the inputs divided by that of the 1000 ng for a given metricHaile et al. BMC Genomics  (2017) 18:515 Page 11 of 14upon UHR input titration that 5′/3′ values are variable de-pending on the input amount (Additional file 5: Table S5).An alternative explanation is that 5′/3′ may be confoundedby non-specific capture of RNAs during mRNA isolation.Indeed, average % rRNA value and % intergenic were 9.4%and 3.8%, respectively, for the previous libraries (Additionalfile 5: Table S6). In contrast, the new libraries displayed0.09%and 2.2%, respectively. Consistent with this possibilityis the observation that the 5′/3′ values are reduced by aconstant proportion (~ 20%) (Fig. 9b). This is despite therange of the ratios for the previous libraries being as di-verse as 0.86 and 1.5 (Additional file 5: Table S6).Expression of thousands of genes is shown to display a veryhigh correlation between the two sets of libraries with Pear-son’s correlation coefficient of 0.93–0.97 (Fig. 9b) where as ex-pression correlation values using the same input and protocolfrom different tumor types were as low as 0.5. Furthermore,expression correlation between the same tumor types rangedbetween 0.83 and 0.91 for both input amounts/protocols.Additional support for the notion that our new protocol pro-vides expression data that is biology-driven (as opposed toprotocol/input amount- driven) comes from hierarchicalclustering data that shows segregation of clades in a sample-dependent manner (Additional file 19: Figure S15).ConclusionsOver all, we have described here a systematicoptimization and evaluation of various modules inssRNA-seq sample preparation for Illumina sequen-cing that included mRNA isolation, cDNA synthesis,bead-based purifications and size selection as well aslibrary construction chemistry. The outcome of thesemultifaceted improvements has enabled a stream-lined, high throughput pipeline that typically yieldshigh quality libraries at tens of nano-molar concen-tration with 10–13 cycles of PCR from as low as25 ng total RNA input. Taken together, the pipelinedescribed here will allow the rapid processing of lowinput samples for transcriptomic studies.Additional filesAdditional file 1: mRNA isolation. (DOCX 75 kb)Fig. 9 Evaluation of the new ssRNA-seq pipeline using RNA from tumor samples. 100 ng total RNA from 12 different tumor samples was used asinput to generate libraries using the new protocol. Adapter-ligated libraries were enriched with 13 cycles of PCR. a Library yield. b Correlation ofexpression. Pearson’s correlation coefficient was calculated pair-wise showing higher correlation between the new lower input libraries and theirprevious higher input counterpartsHaile et al. BMC Genomics  (2017) 18:515 Page 12 of 14Additional file 2: cDNA synthesis. (DOCX 159 kb)Additional file 3: library construction. (DOCX 129 kb)Additional file 4: Figure S1. Superscript II titration. 3.5 μg UHR wasused as input Total RNA followed by multiMACS mRNA isolation. Variousamounts of Superscript II were used in a 25 μL total reaction volume of1st strand synthesis. “New” and “old” denote two different lots of theenzyme. (A) cDNA yield. (B) Duplicate rate. Strand-specific libraries weregenerated from the cDNA samples shown in (A). Libraries were pooledand sequenced using PE75. 20 million reads from each of the librarieswere sampled for calculating duplicate rate. (JPEG 31 kb)Additional file 5: Table S1. Alignment-based metrics for reversetranscriptase comparisons. Table S2. Alignment-based metrics for thesize selection experiment. Table S3. Alignment-based metrics for thecomparison of library construction kits. Table S4. Alignment-basedmetrics for the intermediate ssRNA-seq pipeline. The protocol evaluatedincludes all changes (1st strand cDNA synthesis, optimal bead purifications,new library construction chemistry with modified ligation condition, bead-based size selection, and UNG treatment) with the exception of the mRNAisolation improvements. Table S5. Alignment-based metrics for the finalssRNA-seq pipeline using UHR. Table S6. Alignment-based metrics for thefinal ssRNA-seq pipeline using tumor samples. (XLS 62 kb)Additional file 6: Figure S2. Reverse transcriptases and UHR sequencedivergence. UHR whole transcriptome data was assessed for divergence ofsequence relative to a compendium of human reference comprised of knownsingle nucleotide polymorphisms (SNPs). n = 3; error bars = Standard Deviation.(JPEG 24 kb)Additional file 7: Figure S3. Larger strand-specific library sizes aregenerated using 1:1 post-ligation and post- UNG bead-based purifica-tions. (A) gDNA libraries. Upper panel is a trace representing size profileof sheared HL60 gDNA (replicates are overlayed). 30 ng was used for200 nt-gap library construction where post-ligation clean up was per-formed twice with 1:1 ratio of bead volume to DNA volume. Lower panelis size profile of final libraries. (B) Strand-Specific cDNA libraries. Upperpanel is a trace representing size profile of sheared double-strand cDNA.cDNA was generated from 2 μg UHR total RNA followed by multiMACSmRNA isolation. Post-ligation clean up was performed once with 1:1.Post-UNG was also once with 1:1. Lower panel is size profile of final librar-ies. Note that the strand-specific cDNA libraries are significantly largerthan the gDNA libraries despite the sheared starting material being ofsimilar size profile. (JPEG 50 kb)Additional file 8: Figure S4. UNG digestion improves library yield.Libraries were made from the indicated UHR input amounts in thepresence or absence of UNG in the PCR reaction. UNG from two differentvendors was used. n = 2. (JPEG 26 kb)Additional file 9: Figure S5. The performance of our tool forcalculating strand-specificity. (JPEG 23 kb)Additional file 10: Figure S6. Library construction chemistrycomparison using tumor samples. Two NEB workflows were evaluated.The first has bead clean up after each of the library construction steps(A); the second has the bead cleanup after A-tail removed (B). InputgDNA was from three different clinical samples (Source 1–3). (JPEG 27 kb)Additional file 11: Figure S7. Library yield is improved with the lateststreamlined NEB library construction protocol (NEBUltraII). 100 ng HeLagDNA was used as input and 6 cycles of PCR was applied. PCR modulewith Phusion enzyme was the same for both. NEB-B is the most optimalprotocol identified as shown in Fig. 7. (JPEG 20 kb)Additional file 12: Figure S8. Oligo-dT bead titration for MultiMACSmRNA isolation and its effect on % rRNA and mRNA/cDNA yield. (A) %rRNA as assessed by qPCR. Oligo-dT bead amount was titrated. Indicatedamounts are expressed relative to manufacturer specified amount (1×).qPCR was applied to measure levels of 18S rRNA and GAPDH mRNA. Theratio of 18S to GAPDH levels was calculated based on the Pfaffl method(Pfaffl MW, 2001) and is shown graphically. (B) Duplicate rate upon se-quencing. For selected bead amounts, libraries were generated andsequenced. Y-axis is ratio of duplicate rate calculated relative to themanufacturer specified amount (1×). (C). Oligo-dT bead amount andcDNA yield. Of note, this represents the latest lots of the beads whichappear to have significantly underperformed where the effect of lowerbead amount is most drastic. (JPEG 56 kb)Additional file 13: Figure S9. Workflow and automation of variousMiltiMACS mRNA isolation formats and on-column cDNA synthesis.Streamlined MultiMACS that does not require precipitation before cDNAsynthesis (no ppt) versus another version that requires precipitation (ppt)as well as on-column cDNA synthesis are depicted in (A). In (B), theintegration of the MutiMACS unit with Hamilton Nimbus Microlab liquidhandler is shown. (JPEG 66 kb)Additional file 14: Figure S10. Comparison of various MiltiMACS mRNAisolation formats and on-column cDNA synthesis. Streamlined MultiMACS thatdoes not require precipitation before cDNA synthesis (no ppt) versus anotherversion that requires precipitation (ppt) as well as on-column cDNA synthesiswere compared. For the latter, two different cDNA synthesis mixes were used:one that comes with the Miltenyi kit that involves dT-priming and another thatis based on our protocol which is spiked with random hexameres (in-house).(A) Detection of gene expression. Number of genes whose expression is de-tected at various degrees of coverage is shown for all the four conditions. (B) 3′-end bias. X-axis values represent proportion of genes whose coverage shows acertain degree of asymmetry. When the proportion with asymmetric coverage(x) is 0, there is no bias at all. When x > 0, there more bias towards the 3′-endand vice versa. (C) Mapping to various transcriptomic regions. n = 3.(JPEG 49 kb)Additional file 15: Figure S11. Total RNA input titration using theintermediate ssRNA-seq pipeline. The protocol evaluated includes allchanges (1st strand cDNA synthesis, optimal bead purifications, newlibrary construction chemistry with modified ligation condition, bead-based size selection, and UNG treatment) with the exception of themRNA isolation improvements. (A) Comparable mapping of reads tothe human genome. (B) Comparable mapping of reads to varioustranscriptome catagories. (C) High correlation of expression betweenlower input and higher input libraries. Pearson’s correlation coefficientwas calculated pair-wise as indicated. Heat map was generated forthe resulting values with color intensity representing the degree ofcorrelation as per the depicted legend. n = 3 for all inputs except for2μg and 5μg input amounts where only duplicates were represented.Error bars = Standard Deviation. (JPEG 27 kb)Additional file 16: Figure S12. Comparison of mRNA isolation kits. Themodified MultiMACS protocol with 1/16th bead amount was comparedto two other kits (NEXTflex and NEB). Various UHR inputs were used. (A)cDNA yield. (B) % rRNA as assessed by qPCR. Details are as in Additionalfile 12: Figure S8. All three are shown in the left panel. Right panel showsjust the NEXTflex and NEB on a different scale. (JPEG 37 kb)Additional file 17: Figure S13. High correlation of expressionbetween lower input and higher input libraries. Upper panel is for UHRdata. Lower panel: High correlation of observed versus expected levelsof ERCC spike-in synthetic RNAs. Pearson’s correlation coefficient wascalculated pair-wise among the libraries (UHR data) and between whatwas measured in the libraries versus theoretically expected levels forthe ERCC RNAs. n = 2. (JPEG 50 kb)Additional file 18: Figure S14. RNA integrity comparisons betweenthe input RNAs for the previous and newer libraries. RNA integrity (RIN) isbased on Agilent RNA Nano assay. (JPEG 25 kb)Additional file 19: Figure S15. Hierarchical clustering data that showssegregation of clades in a sample dependent manner as opposed tosegregation by input amount RNA. Samples are as in Fig. 9. LI = lowinput; HI = high input. (JPEG 50 kb)AbbreviationsERCC: External RNA Controls Consortium; mRNA: Messenger RNA;NGS: Next Generation Sequencing; RNA-Seq: RNA sequencing;rRNA: Ribosomal RNA; RT: Reverse Transcriptase; SSII: Superscript II;ssRNA-seq: Strand-specific RNA-seqAcknowledgmentsWe are grateful for the contribution from members of the various groups of theTechnology Platforms at the Genome Sciences Centre, BC Cancer AgencyHaile et al. BMC Genomics  (2017) 18:515 Page 13 of 14including Biospecimen, Quality Assurance, Library Construction,Instrumentation, Sequencing, LIMS, Systems, Project Management, Purchasing,Lab Operations, Administration Support, and Bioinformatics teams.FundingMAM acknowledges support from the Canadian Institutes of Health Research(FDN-143288). This work was also supported by the BC Cancer Foundation,and grants from Genome Canada/Genome British Columbia, WesternEconomic Diversification Canada and Canada Foundation for Innovation andthe National Cancer Institute, National Institutes of Health, under ContractNo. HHSN261200800001E. The content of this publication does notnecessarily reflect the views of policies of the Department of Health andHuman Services, nor does mention of trade names, commercial products, ororganizations imply endorsement by the U.S. Government.Availability of data and materialsRNA-seq data was deposited into the National Center for BiotechnologyInformation's Sequence Read Archive with the identification SRP109672. Allother data supporting the findings is contained within the manuscript.Authors’ contributionsMAM, SJJ, and RAH conceived and oversaw the technological developmentprojects in this study as principal investigators. SH designed and performedexperiments, and analyzed/interpreted data for all aspects of the study withthe assistance from TM in conducting the library construction chemistrycomparisons and the optimization of bead-based purifications under thesupervision of YZ. HK, HM, and PP technically assisted with some of theexperiments. DS and PT scripted programs for liquid handlers under thesupervision of RJC. RDC and SB performed computational analysis ofsequencing data under the supervision of YM. SH and MAM wrote themanuscript with edits and feedback from RDC, YZ, TM, AJM, SJ, RAM,and YM. MB supervised quality control aspects and DM facilitated projectmanagement. AJM, DM, DS, JS, MB, MH, RAM, RDC, RJC, SB, SH, YM and YZdiscussed and monitored overall scientific progress of experiments. Allauthors approved of the final version of the manuscript.Ethics approval and consent to participateSome of the patient samples used in this study are part of the BC CancerAgency’s Personalized Oncogenomics project [14], which was approved bythe University of British Columbia Research Ethics Committee (REB# H12–00137). Sample use was according to the written consent by each patient.Patient identity was made anonymous and sequence data and analysesthereafter were maintained within a secure computing environment.Human promyelocytic Leukemia cell line (HL60) was purchased fromCedarlane Laboratories LTD.Consent for publicationNot applicable.Competing interestsThe authors declare that they have no competing interests.Publisher’s NoteSpringer Nature remains neutral with regard to jurisdictional claims inpublished maps and institutional affiliations.Author details1Canada’s Michael Smith Genome Sciences Centre, BC Cancer Agency, 675West 10th Avenue, Vancouver, BC V5Z 1L3, Canada. 2Department ofMicrobiology and Immunology, Michael Smith Laboratories Centre forHigh-Throughput Biology, University of British Columbia, Vancouver, BC V6T1Z4, Canada. 3Department of Medical Genetics, University of BritishColumbia, Vancouver, BC V6T 1Z3, Canada.Received: 25 October 2016 Accepted: 22 June 2017References1. Engler MJ, Richardson CC. DNA ligases. The Enzymes (Boyer PD, ed.),Academic Press, Inc, New York. 1982;3–29.2. Mills JD, Kawahara Y, Janitz M. Strand-specific RNA-Seq provides greaterresolution of Transcriptome profiling. Current Genomics. 2013;14(3):173–81.3. Sigurgeirsson B, Emanuelsson O, Lundeberg J. Analysis of strandedinformation using an automated procedure for strand specific RNAsequencing. BMC Genomics. 2014;15:631.4. Zhao S, Zhang Y, Gordon W, Quan J, Xi H. Du S, von Schack D and Zhang B.BMC Genomics. 2015;16:675.5. Parkhomchuk D, Parkhomchuk M, Borodina T, Amstislavskiy V, Banaru M,Hallen L, et al. Transcriptome analysis by strand-specific sequencing ofcomplementary DNA. Nucleic Acids Res. 2009;37:e123.6. Levin JZ, Yassour M, Adiconis X, Nusbaum C, Thompson DA, Friedman N, etal. Comprehensive comparative analysis of strand-specific RNA sequencingmethods. Nat Methods. 2010;7(9):709–15.7. Hawkins TL, O'Connor-Morin T, Roy A, Santillan C. DNA purification andisolation using a solid-phase. Nucleic Acids Res. 1994;22(21):4543–4.8. DeAngelis MM, Wang DG, Hawkins TL. Solid-phase reversible immobilizationfor the isolation of PCR products. Nucleic Acids Res. 1995;23(22):4742–3.9. Borgstrom E, Lundin S, Lundeberg J. Large scale library generation for highthroughput sequencing. PLoS One. 2011;6:e19119.10. Lundin S, Stranneheim H, Pettersson E, Klevebring D, Lundeberg J.Increased throughput by parallelization of library preparation for massivesequencing. PLoS One. 2010;5:e10029.11. Lucas MC, Jacobson JW, Giles NH. Characterization and in vitro translationof polyadenylated messenger ribonucleic acid from Neurospora crassa. JBacteriol. 1977;130:1192–8.12. Sturani E, Costantini MG, Martegani E, Alberghina L. Level and turnover ofpolyadenylate-containing ribonucleic acid in Neurospora crassa in differentsteady states of growth. Eur J Biochem. 1979;99:1–7.13. Kuai L, Fang F, Butler JS, Sherman F. Polyadenylation of rRNA inSaccharomyces cerevisiae. Proc Natl Acad Sci U S A. 2004;101:8581–6.14. Aviv H, Leder P. Purification of biologically active globin messenger RNA bychromatography on oligothymidylic acid–cellulose. Proc Natl Acad Sci U SA. 1972;69:1408–12.15. Benes V, Blake J, Doyle K. Ribo-zero gold kit: improved RNA-seq results afterremoval of cytoplasmic and mitochondrial ribosomal RNA. Nat Methods.2011;8:11.16. Adiconis X, et al. Comparative analysis of RNA sequencing methods fordegraded or low-input samples. Nat Methods. 2013;10:623–9.17. Morlan JD, et al. Selective depletion of rRNA enables whole transcriptomeprofiling of archival fixed tissue. PLoS One. 2012;77:e42882.18. External RNA. Controls Consortium. Proposed methods for testing andselecting the ERCC external RNA controls. BMC Genomics. 2005;6:150.19. External RNA Controls Consortium. The external RNA controls Consortium: aprogress report. Nat Methods. 2005;2:731–34.20. Morin DR, Bainbridge M, Fejes A, Hirst M, Krzywinski M, Pugh JT, et al.Profiling the HeLa S3 transcriptome using randomly primed cDNA andmassively parallel short-read sequencing. BioTechniques. 2008;45(1):81–94.21. Anders S, Huber W. Differential expression analysis for sequence count data.Genome Biol. 2010;11:R106.•  We accept pre-submission inquiries •  Our selector tool helps you to find the most relevant journal•  We provide round the clock customer support •  Convenient online submission•  Thorough peer review•  Inclusion in PubMed and all major indexing services •  Maximum visibility for your researchSubmit your manuscript atwww.biomedcentral.com/submitSubmit your next manuscript to BioMed Central and we will help you at every step:Haile et al. BMC Genomics  (2017) 18:515 Page 14 of 14


Citation Scheme:


Citations by CSL (citeproc-js)

Usage Statistics



Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            async >
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:


Related Items