
Systematic evaluation of RNA-Seq preparation protocol performance



- We used three standard input protocols: Illumina TruSeq Stranded Total RNA and mRNA kits, a modified NuGEN Ovation v2 kit, and the TaKaRa SMARTer Ultra Low RNA Kit v3.
- Conclusions: At the manufacturers' recommended input RNA levels, all the RNA-Seq library preparation protocols evaluated were suitable for distinguishing between experimental groups, and the TruSeq Stranded mRNA kit was universally applicable to studies focusing on protein-coding gene profiles.
- The TruSeq protocols tended to capture genes with higher expression and GC content, whereas the modified NuGEN protocol tended to capture longer genes.
- The SMARTer Ultra Low RNA Kit may be a good choice at the low RNA input level, although it was inferior to the TruSeq mRNA kit at standard input level in terms of rRNA removal, exonic mapping rates and recovered DEGs.
- Therefore, the choice of RNA-Seq library preparation kit can profoundly affect data outcomes.
- RNA-Seq is used primarily to identify differentially expressed genes (DEGs) in different biological conditions, but it is also used to discover non-coding RNAs such as microRNAs and long non-coding RNAs (lncRNAs) [7].
- The first goal of our study was to investigate confounding factors in RNA-Seq library preparation protocols using three standard input kits: the TruSeq Stranded Total RNA and mRNA Library Prep Kits from Illumina, and a modified NuGEN Ovation® RNA-Seq System.
- Defining the properties of the data generated using these protocols may aid users in designing their future RNA-Seq strategies.
- Our results demonstrated that the TruSeq Stranded mRNA protocol was the best for transcriptome profiling and that the TruSeq Stranded Total RNA and mRNA protocols were comparable, whereas the modified NuGEN protocol performed less well for whole transcriptome analysis, but might be a better choice for studies focused on non-coding RNAs.
- Lastly, although the results obtained with the SMARTer Ultra Low RNA Kit were comparable to those of the TruSeq Stranded mRNA kit for most metrics and for identification of DEGs, the absolute expression levels were only moderately correlated.
- We used the manufacturer-recommended optimal input amounts (1 μg for both the Illumina TruSeq Stranded Total RNA and the Illumina TruSeq Stranded mRNA protocols.
- 1a, the Illumina TruSeq Stranded Total RNA protocol uses Ribo-Zero to remove rRNA, whereas the TruSeq Stranded mRNA protocol enriches mRNA through polyA selection [11].
- To check whether this modified ultra-low input protocol was capable of generating quality data, we compared the mESC dataset derived from the TaKaRa SMARTer cDNA synthesis step combined with Nextera library preparation, to the high-quality datasets obtained using the TruSeq Stranded mRNA protocol with 2 μg total RNA as the input level.
- The data analysis flow and the data quality metrics used in this study to evaluate RNA-Seq protocols are diagrammed in Fig.
- We found that for the libraries created with the modified NuGEN, TruSeq Stranded Total RNA, and TruSeq Stranded mRNA protocols.
- 93%, for both TruSeq Stranded Total RNA and TruSeq Stranded mRNA libraries, and ~ 60% for the modified NuGEN library (Fig.
- 3–5%, and ~ 2% of total non-rRNA fragments from the samples prepared with the TruSeq Stranded Total RNA, TruSeq Stranded mRNA, and modified NuGEN protocols, respectively (Fig.
- 1 Experimental design and RNA-Seq data quality metrics.
- a Flow chart outlining the experimental design for comparing the three standard input RNA-Seq library preparation protocols.
- 10 ng RNA, 10 PCR cycles) were used to make cDNA for the TaKaRa SMARTer Low Input RNA-Seq kit v3 protocol.
- c A diagram depicting the data analysis flow and the data quality metrics used in this study to evaluate RNA-Seq protocols.
- Reads from the TruSeq Stranded Total RNA and TruSeq Stranded mRNA protocols were more evenly distributed along the entire length of the transcript (Fig.
- Closer examination of each nucleotide within 1000 bp of the 5′- and 3′-ends confirmed that the modified NuGEN protocol failed to capture the RNA signal towards the 5′-end (Additional file 2: Figure S2A, C), and also suggested that the TruSeq Stranded mRNA protocol missed the signal within 200 bp of the 3′-end, compared to the TruSeq Stranded Total RNA protocol (Additional file 2:.
- Representation of the transcriptome (standard input protocols).
- We found that for the TruSeq Stranded Total RNA and mRNA protocols, respectively, approximately 67–84% and 88–91% of the fragments were from exonic regions.
- For the modified NuGEN protocol, only 35–45% of the fragments were from exonic regions.
- Since only the TruSeq protocols are strand-specific, as.
- For coding genes, the saturation curves from the TruSeq Stranded Total RNA and mRNA libraries looked very similar and were superior to those from the NuGEN libraries (Fig.
- For lncRNAs, the modified NuGEN protocol outperformed both the TruSeq Stranded Total RNA and mRNA protocols, yielding more lncRNAs at the same sequencing depth (Fig.
- We recovered the lowest number of splice junctions using the modified NuGEN protocol and the highest number with the TruSeq Stranded mRNA protocol.
- The correlation coefficients between samples prepared using different protocols were lower between the TruSeq Stranded Total RNA and mRNA protocols between the TruSeq Stranded Total RNA and modified NuGEN protocols, and 0.77–0.82 between the TruSeq Stranded mRNA and modified NuGEN protocols (Fig.
- Unsupervised clustering demonstrated that the whole transcriptome expression profiles obtained from TruSeq Stranded Total RNA and mRNA libraries were more similar to each other than either was to the NuGEN libraries (Fig.
- Principal component analysis (PCA) recapitulated the clustering analysis: the NuGEN libraries were separated from the TruSeq libraries in the first component, whereas the TruSeq Stranded Total RNA and mRNA libraries were separated in the second component (Fig.
- Further investigation revealed the TruSeq protocols tended to capture genes with higher expression and GC content, whereas the modified NuGEN protocol tended to capture longer genes (Additional file 7: Figure S7B-C).
- Comparing the TruSeq mRNA protocol to the TruSeq Total RNA protocol showed that the TruSeq mRNA protocol preferentially recovered genes with higher GC content and shorter length (Additional file 7:.
- To exclude the possibility that these differences stemmed from batch effects, such as different sets of libraries being prepared at different times, we included additional technical replicates, prepared at different times, for the TruSeq Stranded Total RNA and mRNA protocols (1 μg).
- 3 Representation of the transcriptome for all the libraries prepared with standard protocols.
- Three hundred ninety-four DEGs were detected across all three RNA-Seq library preparation protocols, accounting for 41, 38, and 28% of the total DEGs detected when using the TruSeq Stranded Total RNA, TruSeq Stranded mRNA, and modified NuGEN protocols, respectively (Fig.
- The pairwise scatter plots of log2 ratio values between DEGs from control and experimental mouse tumor tissues showed that the TruSeq Stranded Total RNA and mRNA results were more highly correlated with each other (Spearman’s correlation coefficient = 0.99) than either was with the modified NuGEN protocol (Spearman’s correlation coefficient = 0.80 and 0.79, respectively) (Fig.
- That is, the TruSeq Total RNA and mRNA protocols yielded more shared DEGs than either did with the modified NuGEN protocol (Fig.
- The DEGs recovered with the TruSeq Total RNA and mRNA protocols had correlation coefficients of 0.78 and 0.76 vs.
- However, independent validation of DEGs by qPCR indicated that the differential expression results from the TruSeq Stranded Total RNA and mRNA protocols might be more accurate than those from the modified NuGEN protocol.
- Effectively executing low-input RNA-Seq is essential to achieve these goals.
- clones (biological replicates), we evaluated its performance by comparing it to that of the TruSeq Stranded mRNA protocol using 2 μg of total RNA, as a.
- 10 ng RNA) levels than did the TruSeq Stranded mRNA protocol using standard input RNA amounts (Fig.
- The percentage of fragments with both ends mapped to the genome was 91–92% for the TruSeq Stranded mRNA protocol and 60–65% for the SMARTer protocol using either 100 or 1000 cells (Fig.
- 6% were from intronic regions, and ~ 4% were from intergenic regions, which was comparable to libraries from the TruSeq Stranded mRNA protocol (Fig.
- For coding genes, the saturation curves for libraries from the SMARTer protocol with 100 and 1000 cells were very similar and were slightly less robust than those from the TruSeq Stranded mRNA protocol (Fig.
- The SMARTer protocol outperformed the TruSeq Stranded mRNA protocol in recovering more lncRNAs at the same sequencing depth (Fig.
- However, at the same sequencing depth, the number of splice junctions detected in libraries from the SMARTer protocol was lower than in libraries from the TruSeq Stranded mRNA protocol (Fig.
- Overall, low-input RNA samples subjected to the SMARTer protocol, when compared to the TruSeq Stranded mRNA protocol, produced data with greater rRNA contamination but similar rates of exon detection.
- However, the coefficients between samples prepared using the SMARTer and standard TruSeq Stranded mRNA protocols were lower Fig.
- Further investigation showed the SMARTer protocol tended to allow recovery of genes with higher expression, lower GC content, and shorter length, compared to the TruSeq mRNA protocol (Additional file 7:.
- There were 2623 DEGs shared between the SMARTer libraries generated from either 100 or 1000 cells and the TruSeq Stranded mRNA libraries, accounting for 40, 37, and 23% of the total DEGs detected in each, respectively, but the majority of DEGs recovered from the TruSeq Stranded mRNA libraries (4376 genes) were excluded from the SMARTer libraries (Fig.
- In summary, the SMARTer Ultra Low RNA Kit is capable of capturing the effect of biological conditions, but is not as robust as the standard input protocol at a normal input level of 2 μg for the TruSeq Stranded mRNA-Seq protocol.
- These protocols were the standard input Illumina TruSeq Stranded Total RNA, Illumina TruSeq Stranded mRNA, and modified NuGEN Ovation v2 kits.
- One impediment to the efficient recovery of meaningful RNA-Seq data is repetitive rRNA.
- For the three standard protocols and the one ultra-low input protocol we evaluated, the TruSeq Stranded Total RNA and the modified NuGEN Ovation RNA-Seq System V2 protocols employ rRNA depletion methods, whereas the TruSeq Stranded mRNA protocol and SMARTer Ultra-low protocol use polyA enrichment methods to reduce rRNA contamination in sequencing libraries.
- In our present study, the modified NuGEN protocol libraries averaged 15–20% of their reads mapping to rRNA, as compared to 1–5% for the TruSeq protocols (Fig.
- We also examined the rRNA mapping rate in libraries prepared from two polyA-enrichment protocols, the Illumina TruSeq Stranded mRNA protocol and the TaKaRa SMARTer Ultra Low RNA protocol.
- The SMARTer protocol yielded a 7–9% rRNA mapping rate, which was inferior to the TruSeq protocol at standard RNA input levels (1%) (Fig.
- The TruSeq protocols yielded a ≥ 90% overall mapping rate for fragments with both ends mapped to the genome, compared to 60% for the modified NuGEN protocol (Fig.
- The TruSeq Stranded mRNA libraries were also somewhat biased, as reflected by a lack of reads within 200 bp of the 3′-end, relative to the TruSeq Total RNA libraries (Additional file 2: Figure S2B, 2D).
- This may be because of the difference between the rRNA depletion approaches used by the TruSeq mRNA and TruSeq Total RNA protocols, resulting in more unmappable reads near the 3′-end in TruSeq mRNA libraries due to the presence of polyA tails in these reads.
- Ninety percent of our reads were mapped to exons using the TruSeq Stranded mRNA kit, 67–84% using the Total RNA kit, and 35–46% using the NuGEN kit (Fig.
- This is further supported by our finding that, compared to the three standard input protocols, the polyA-based TaKaRa SMARTer Ultra Low RNA Kit had almost the same exonic coverage as the TruSeq Stranded mRNA protocol (Fig.
- (after removing PCR duplicates) [13], whereas our TruSeq Stranded Total RNA libraries consisted of 14–28% intronic sequences.
- In contrast, the TruSeq Stranded mRNA libraries contained only 6–8% intronic sequences (Fig.
- In this case, better lncRNA recovery may be due to differences in the cDNA synthesis step rather than in the rRNA depletion step: whereas the TruSeq Stranded Total RNA protocol uses only random primers for cDNA synthesis, the modified NuGEN protocol uses a.
- For the purpose of evaluation, the above analyses also include the libraries prepared with the TruSeq Stranded mRNA protocol using the same biological conditions.
- In the current study, we identified 960 and 1028 DEGs between experimental and control tumor tissues using the TruSeq Total RNA and mRNA protocols (manuscript in preparation), respectively, which was slightly fewer than the 1430 DEGs identified using the modified NuGEN protocol (Fig.
- That is, the modified NuGEN protocol may have resulted in more false-positive DEGs than did the TruSeq protocols.
- The comparable performance of the TruSeq Total and mRNA protocols in our study contrasts with the results of Zhao et al., who directly compared the TruSeq Stranded Total and mRNA protocols using clinical samples.
- They found the TruSeq Stranded mRNA libraries more accurately predicted gene expression levels than the TruSeq Stranded Total RNA libraries [11].
- First, we found that the TruSeq Stranded mRNA protocol is universally applicable to studies focusing on dissecting protein-coding gene profiles when the amount of input RNA is sufficient, whereas the modified NuGEN protocol might provide more information in studies designed to understand lncRNA profiles.
- Therefore, choosing the appropriate RNA-Seq library preparation protocol for recovering specific classes of RNA should be a part of the overall study design [18].
- For the purpose of evaluation, the libraries prepared from the same biological conditions with the TruSeq Stranded mRNA protocol are also included.
- c Venn diagram showing the number of DEGs recovered with the SMARTer Ultra Low RNA (100 cells and 1000 cells) and the TruSeq Stranded mRNA kits.
- For the three standard input RNA-Seq library preparation protocols (Illumina TruSeq Stranded Total RNA, TruSeq Stranded mRNA, and the modified NuGEN Ovation RNA-Seq kits), total RNA was isolated from three xenograft tumors (biological replicates) from control (30% calorie-restricted diet [19]) and experimental (diet-induced obese, OB) xenograft mouse models in the C57BL/6 genetic background, respectively.
- TruSeq stranded total RNA and mRNA library preparations.
- Libraries were prepared using the Illumina TruSeq Stranded Total RNA (Cat.
- NuGEN ovation RNA-Seq system v2 modified with SPRI-TE library construction system.
- Total RNA (100 ng) was converted to cDNA using the NuGEN Ovation RNA-Seq System v2 (Cat.
- RNA-Seq data analysis Mapping.
- (A and C) and 3′-end (B and D) of the transcripts.
- The TruSeq Total RNA and mRNA libraries shown in A and B were prepared from 1 μg RNA and in C and D were prepared from 100 ng RNA.
- Concordance of expression quantification using standard protocols with additional technical replicates prepared by the TruSeq Stranded Total RNA and mRNA protocols.
- Blue, green and red dots represent libraries prepared using the TruSeq Stranded Total RNA, TruSeq Stranded mRNA, and NuGEN protocols, respectively.
- The modified NuGEN protocol is not included for the comparison, because one of the libraries prepared with the TruSeq Total RNA protocol (100 ng) and the TruSeq mRNA protocol (100 ng) used a different xenograft tumor from a different mouse.
- the TruSeq Total RNA protocol.
- Panel B shows the TruSeq mRNA protocol vs.
- Panel E shows the TruSeq mRNA protocol vs.
- Panel F shows the TruSeq mRNA protocol vs.
- RNA-Seq: Ribonucleic acid sequencing.
- Mapping and quantifying mammalian transcriptomes by RNA-Seq.
- A comprehensive assessment of RNA-seq protocols for degraded and low-quantity samples.
- Ribosomal RNA depletion for efficient use of RNA-seq capacity.
- TopHat: discovering splice junctions with RNA-Seq.
- RSeQC: quality control of RNA-seq experiments.
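
As a quick consistency check on the shared-DEG figures quoted in the summary, the sketch below recomputes what share of each standard protocol's DEGs the 394 common DEGs represent. The counts (394 shared; 960, 1028, and 1430 total) are those stated in the text; the variable and function names are illustrative only and do not come from the paper's analysis code.

```python
# DEG counts as reported in the summary text (standard input protocols).
SHARED_DEGS = 394
TOTAL_DEGS = {
    "TruSeq Stranded Total RNA": 960,
    "TruSeq Stranded mRNA": 1028,
    "modified NuGEN": 1430,
}


def shared_percent(shared: int, total: int) -> int:
    """Return the shared DEGs as a whole-number percentage of a protocol's total DEGs."""
    return round(100 * shared / total)


# Hypothetical helper mapping: protocol name -> percentage of its DEGs that are shared.
percentages = {
    name: shared_percent(SHARED_DEGS, total) for name, total in TOTAL_DEGS.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct}% of DEGs shared across all three protocols")
```

The rounded values agree with the 41, 38, and 28% reported for the TruSeq Stranded Total RNA, TruSeq Stranded mRNA, and modified NuGEN protocols, respectively.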
