
BaRTv1.0: An improved barley reference transcript dataset to determine accurate changes in the barley transcriptome using RNA-seq


Summary

- Results: A high-quality, non-redundant barley gene RTD and database (Barley Reference Transcripts – BaRTv1.0) has been generated.
- BaRTv1.0 was constructed from a range of tissues, cultivars and abiotic treatments, with transcripts assembled and aligned to the barley cv.
- BaRTv1.0-Quantification of Alternatively Spliced Isoforms (QUASI) was also made to overcome inaccurate quantification due to variation in 5′ and 3′ UTR ends of transcripts.
- BaRTv1.0-QUASI was used for accurate transcript quantification of RNA-seq data of five barley organs/tissues.
- Precise transcript quantification using BaRTv1.0 allows routine analysis of gene expression and AS.
- Here, we describe the development of a first barley reference transcript dataset and database (Barley Reference Transcripts – BaRTv1.0) consisting of 60,444 genes and 177,240 non-redundant transcripts.
- To create BaRTv1.0, we used 11 different RNA-seq experimental datasets representing 808 samples and 19.3 billion reads that were derived from a range of tissues, cultivars and treatments.
- We further compared the BaRTv1.0 transcripts to 22,651 Haruna nijo full-length (fl) cDNAs [37] to assess the completeness and representation of the reference transcript dataset.
- As in Arabidopsis, we also generated a version of the RTD specifically for quantification of alternatively spliced isoforms (BaRTv1.0-QUASI) for accurate expression and AS analysis, which overcomes inaccurate quantification due to variation in the 5′ and 3′ UTR [53, 61].
- Finally, we used BaRTv1.0-QUASI to explore RNA-seq data derived from five diverse barley organs/tissues, identifying 20,972 differentially expressed genes and 2791 differentially alternatively spliced genes amongst the samples.
- The raw RNA-seq data of all samples were quality controlled.
- At each stage the spliced proportions from HR RT-PCR were compared to the spliced proportions of the same AS event(s) derived from the Transcripts Per Million.
- Fig. 1: BaRTv1.0 assembly and validation pipeline.
- Steps in construction and validation of BaRTv1.0 and programs used in each step (right hand side).
- (b) The number of HR RT-PCR products that match transcripts.
- (c) Correlation of the proportions of transcripts in 86 AS events derived from HR RT-PCR and the RNA-seq data using the different assemblies as reference for transcript quantification by Salmon.
- Previous studies in Arabidopsis and human RNA-seq analysis showed that variation in the 5′ and 3′ ends of assembled transcript isoforms of the same gene affected accuracy of transcript quantification.
- and 3′ ends of the longest gene transcript [61, 63].
- We similarly modified BaRTv1.0 to produce transcripts of each gene with the same 5′ and 3′ ends to generate BaRTv1.0-QUASI specifically for transcript and AS quantification.
- BaRTv1.0 represents an improved barley transcript dataset. The barley cv.
- The quality control filters used in the construction of BaRTv1.0 aimed to reduce the number of transcript fragments and redundancy, as these negatively impact the accuracy of transcript quantification [61].
- The BaRTv1.0 and HORVU datasets were directly compared using the numbers of complete Haruna nijo fl cDNAs and by correlating the proportions of AS transcript variants measured by HR RT-PCR with those derived from the RNA-seq analysis (Additional file 1: Table S4).
- The BaRTv1.0 transcript dataset identified more of the experimentally determined HR RT-PCR products (220 versus 191) and has higher Pearson and Spearman correlation coefficients (r) with quantification of the.
- For the AS events detected in BaRTv1.0 and HORVU, we plotted the percentage spliced in (PSI) values (the fraction of mRNAs that represent the isoform that includes most exon sequence.
- Pearson and Spearman ranked correlation (r) of the AS proportion values showed an improvement when comparing the HR RT-PCR with the three RNA-seq reference transcript datasets, HORVU (0.769 and 0.768), BaRTv1.0 (0.793 and 0.795) and BaRTv1.0-QUASI (0.828 and 0.83) (Table 1.
- We conclude that BaRTv1.0 (and the derived BaRTv1.0-QUASI) RTD is a comprehensive, non-redundant dataset suitable for differential gene expression and AS analyses.
- BaRTv1.0 genes and transcripts.
- We next explored the characteristics of BaRTv1.0 genes and transcripts.
- A total of 57% of the BaRTv1.0 genes.
- Analysis of the 177,240 predicted transcripts in BaRTv1.0 showed the expected distribution of canonical splice site dinucleotides.
- Frequencies of the different AS events were consistent with studies in other plant
- Table 1: Transcriptome dataset comparisons with HR RT-PCR and Haruna nijo fl cDNAs (transcriptome versions: BaRTv1.0, BaRTv1.0-QUASI, HORVU; rows include HR RT-PCR products).
- Fig. 3: Correlation of alternative splicing from HR RT-PCR and RNA-seq.
- fluorescence units from HR RT-PCR and transcript abundances (TPM) from RNA-seq data quantified with Salmon using the (a) BaRTv1.0, (b) HORVU and (c) BaRTv1.0-QUASI transcript datasets as reference.
- Of the alternative 3′.
- We used RNA-seq data from three biological repeats of five organs/tissues of Morex to quantify transcripts with Salmon and BaRTv1.0-QUASI.
- transcripts of the gene [10].
- Validation of differential AS from RNA-seq with HR RT-PCR and RNA-seq.
- To validate differential AS observed for individual genes among the different organs/tissues, we compared the RNA-seq quantifications of the 86 AS genes and 220 transcripts used in HR RT-PCR.
- Each of these examples shows that the patterns of AS across the tissues are essentially equivalent between HR RT-PCR and RNA-seq (Fig.
- Thus, there is good agreement between the differential alternative splicing analysis from the RNA-seq data and the experimental verification with HR RT-PCR.
- These data provide strong support for the value of using BaRTv1.0 and BaRTv1.0-.
- A principal aim of establishing BaRTv1.0 was to achieve higher accuracy of differential expression and AS analysis in barley RNA-seq datasets by improved transcript quantification.
- 344 k) was approximately halved in BaRTv1.0 (ca.
- BART1_0-u51812 contains 44 different transcript isoforms in the BaRTv1.0 dataset due to unique combinations of different AS events (Fig.
- Fig. 5: Comparison of alternative splicing in different barley tissues with HR RT-PCR and RNA-seq data.
- splice sites and two alternative exons from the BaRTv1.0 transcripts (Fig.
- These AS events were also quantified using transcript abundances from the RNA-seq data using BaRTv1.0-QUASI and showed good agreement with the HR RT-PCR results, with Pearson correlations of 0.92 for the Hv78 regions and 0.73 for the Hv79 region.
- These examples support the accuracy of alternative splicing found in BaRTv1.0 and that the proportions of alternative splice sites selected in short-read RNA-seq can be determined.
- Here we describe the BaRTv1.0 transcript dataset or transcriptome for barley, produced by merging and filtering transcripts assembled from extensive RNA-seq data and its utility in differential expression and differential alternative splicing.
- Finally, the BaRTv1.0 transcript dataset will enable accurate gene and transcript level expression and AS analysis increasing our understanding of the full impact of AS and how transcriptional and AS regulation of expression.
- However, the arrangement of BaRTv1.0 transcripts has identified mis-annotated chimeric genes in the barley reference genome, helping to improve gene resolution.
- BaRTv1.0 was established using RNA-seq data containing approximately 19 billion reads from a range of different biological samples (organs, tissues, treatments and genotypes) and was assembled initially against the Morex genome.
- A key function of the BaRTv1.0 transcript dataset is improved accuracy of transcript abundance.
- We also found an improvement in the quantification of transcripts and splicing proportions by applying the same approach to produce the BaRTv1.0-QUASI version, specifically for quantification of alternatively spliced isoforms (Table 1).
- To demonstrate the value of the new RTD for gene expression studies and AS analysis, we used BaRTv1.0-QUASI to quantify transcripts in the five developmental organs and tissues RNA-seq datasets that we had used previously for HR RT-PCR optimisation and validation.
- BART1_0-u51812 transcript models represented in the BaRTv1.0 database.
- AS events involving intron 2 validated by HR-RT-PCR.
- AS events between exon 6 and 8 validated by HR-RT-PCR.
- Electropherogram output from the ABI3730 shows the HR RT-PCR products (x-axis RT-PCR products (bp).
- indicates minor alternative transcripts identified in HR RT-PCR and in RNA-seq.
- indicates an uncharacterised alternative transcript identified in HR RT-PCR.
- BaRTv1.0 enables rapid and robust analysis of gene expression and AS in a wide range of experimental scenarios.
- BaRTv1.0 is based on cv. Morex but used RNA-seq data from a wide range of cultivars and lines.
- A comprehensive, non-redundant barley reference transcript dataset called BaRTv1.0 has been generated, which enables fast, precise transcript abundances.
- BaRTv1.0 is part of a unique pipeline that facilitates the robust routine analysis of barley gene expression and AS.
- Selected RNA-seq datasets and data processing.
- to each RNA-seq transcriptome assembly generated.
- High resolution RT-PCR.
- Morex was used for HR RT-PCR validation [35].
- Comparing HR RT-PCR and RNA-seq alternative splicing proportions.
- To assess the accuracy of BaRTv1.0 to detect changes in AS in the RNA-seq data, we compared the splicing proportions for AS events from HR RT-PCR with those calculated from the RNA-seq data using the HORVU transcript set, BaRTv1.0 and BaRTv1.0-QUASI as transcript references.
- For this reason, multiple RNA-seq transcripts may represent the same AS product that is detected by HR RT-PCR.
- The proportions of the different AS products for both HR RT-PCR and RNA-seq were then calculated and correlated.
- PCR and RNA-seq were identified.
- Finally, based on the calculated values of RNA-seq levels of expression and the calculated values of HR RT-PCR for each RT-PCR product, the proportions of the alternative transcripts were calculated.
- Generation of the BaRTv1.0 database.
- Statistical analysis HR RT-PCR ANOVA.
- Mean proportions of alternatively spliced products by HR RT-PCR analysis.
- Correlation of HR RT-PCR data with BaRTv1.0, BaRTv1.0-QUASI and HORVU transcripts.
- Pipeline describing the algorithm to compare HR RT-PCR and RNA-seq alternatively spliced transcript proportions and correlations.
- HR RT-PCR: High resolution RT-PCR.
- RNA-seq: RNA-sequencing.
- PR-F and MB assembled the RNA-seq data.
- JF, GS and CS identified the AS genes and performed the HR RT-PCR screening and analysis of the data.
- PR-F, C-DM, JWSB, RZ, WG and CS performed the detailed analysis of the RNA-seq and HR RT-PCR data.
- BaRTv1.0 and BaRTv1.0-QUASI are available as .fasta and.
- To develop BaRTv1.0 we used publicly available sequences from the Sequence Read Archive (SRA) or European Nucleotide Archive (ENA) (accession numbers: PRJEB13621.
- Near-optimal probabilistic RNA-seq quantification.
- Optimizing RNA-Seq mapping with STAR.
- STAR: ultrafast universal RNA-seq aligner.
- Systematic evaluation of spliced alignment programs for RNA-seq data.
- A physical, genetic and functional sequence assembly of the barley genome.
- Transcriptome survey reveals increased complexity of the alternative splicing landscape in Arabidopsis.
- A chromosome conformation capture ordered sequence of the barley genome.
- Expansion of the eukaryotic proteome by alternative splicing.
- Complexity of the alternative splicing landscape in plants

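The comparison of splicing proportions summarised above can be illustrated with a short Python sketch: several RNA-seq transcripts may correspond to one HR RT-PCR product, so their TPM values are summed per product, converted to proportions, and then correlated with the HR RT-PCR proportions. This is a minimal illustration under stated assumptions, not the authors' pipeline. The transcript IDs, product labels, TPM values and HR RT-PCR proportions are hypothetical, and the helper names `product_proportions` and `pearson` are introduced here only for illustration.

```python
from collections import defaultdict
from math import sqrt


def product_proportions(tpm_by_transcript, product_of_transcript):
    """Sum TPMs of transcripts that give the same RT-PCR product,
    then divide by the total so the proportions add up to 1."""
    totals = defaultdict(float)
    for transcript, tpm in tpm_by_transcript.items():
        totals[product_of_transcript[transcript]] += tpm
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {product: 0.0 for product in totals}
    return {product: value / grand_total for product, value in totals.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical TPM values for four transcripts of one gene (assumption).
tpm = {"t1": 30.0, "t2": 10.0, "t3": 40.0, "t4": 20.0}
# Hypothetical mapping of each transcript to the amplicon it would yield.
product_of = {"t1": "150bp", "t2": "150bp", "t3": "230bp", "t4": "310bp"}
# Hypothetical HR RT-PCR proportions for the same products (assumption).
hr_rtpcr = {"150bp": 0.45, "230bp": 0.35, "310bp": 0.20}

rnaseq = product_proportions(tpm, product_of)
products = sorted(rnaseq)
r = pearson([rnaseq[p] for p in products], [hr_rtpcr[p] for p in products])
print(rnaseq, round(r, 3))
```

In practice the correlation would be computed over all AS events and samples rather than a single gene, and a rank-based (Spearman) coefficient can be obtained by applying the same function to the ranks of the values.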