
Plant stress RNA-seq Nexus: A stress-specific transcriptome database in plant cells


Summary

- RNA sequencing (RNA-Seq) is a revolutionary tool that has been used extensively in plant stress research.
- However, no existing large-scale RNA-Seq database has been designed to provide information on the stress-specific differentially expressed transcripts that occur across diverse plant species and various stresses.
- Results: We have constructed a comprehensive database, the plant stress RNA-Seq nexus (PSRN), which includes 12 plant species, 26 plant-stress RNA-Seq datasets, and 937 samples.
- All samples are assigned to 133 stress-specific subsets, which are combined into 254 subset pairs (each pair being a comparison between two selected subsets) for identifying stress-specific differentially expressed transcripts.
- Conclusions: PSRN is an open resource for intuitive data exploration. It provides expression profiles of coding transcripts and lncRNAs and identifies which transcripts are differentially expressed between stress-specific subsets, in order to help researchers generate new biological insights and hypotheses in molecular breeding or evolution.
- RNA-Seq has been used extensively in plant research.
- Recently, global transcriptome profiling analysis using RNA-Seq has been reported to identify differentially expressed lncRNAs, coding genes and alternatively spliced isoforms in response to environmental stresses, such as salt, heat, cold, drought, light, ozone, excessive boron, and pathogen infection [12–14].
- 2018 Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
- The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Therefore, a large-scale stress-specific RNA-Seq database that can provide comprehensively visualized transcriptome expression profiles and statistical analysis for differential expression has not been reported for plants.
- In the postgenomic era, RNA-Seq provides a global transcriptome profile, which can cover lncRNAs, coding genes and their alternatively spliced isoforms, and helps plant biologists gain new insights into molecular mechanisms and responses to biotic and abiotic events.
- Thus, we developed an extensive genome-wide plant stress RNA-Seq database.
- In this study, we constructed the first large-scale plant stress RNA-Seq database, the plant stress RNA-Seq nexus (PSRN), which offers the following features: (i) large-scale and comprehensive data archives and analyses, including coding-transcript profiling and lncRNA profiling, (ii) phenotype-oriented data organization and searching, and (iii) the visualization of expression profiles, as well as differential expression.
- PSRN was developed with the goal of collecting, processing, analyzing and visualizing publicly available plant RNA-Seq data.
- It resulted in 12 plant species and 26 plant-stress RNA-Seq datasets, including 133 stress-specific subsets and 937 samples (Additional file 1: Table S1).
- Each subset is a group of plant RNA-Seq samples associated with a specific stress phenotype or genotype.
- In addition, PSRN provides a user-friendly interface to efficiently organize and visualize the expression profiles of the differentially expressed transcripts for any pair of stress-specific subsets (Additional file 2: Table S2).
- Plant stress RNA-Seq dataset collection.
- The annotation for the RNA-Seq datasets regarding plant stress was collected from NCBI GEO [21], and RNA-Seq reads were downloaded from the SRA [22].
- Each dataset has several stress-specific subsets that contain a group of RNA-Seq samples of plants treated with.
- We only retained the datasets that had a reference transcriptome for subsequent RNA-Seq analysis.
- Stress-specific differentially expressed transcripts.
- To estimate the expression profile of each sample, Bowtie [31], version 1.1.2, was used to build an index of the reference sequence and to align RNA-Seq reads to the reference transcriptome using that index.
- Samples with at least 1 million raw reads and over 15% of reads mapped to the reference transcriptome were retained for subsequent analysis to ensure sufficient sequencing depth [32].
- After alignment, transcript abundances of each sample were estimated using eXpress [33], version 1.5.1, with the expression quantification unit "fragments per kilobase of transcript per million mapped reads" (FPKM) [34].
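The two numeric rules in this step can be sketched in Python as follows. The thresholds (1 million raw reads, more than 15% mapped) come from the text; the function names and arguments are hypothetical, and the FPKM formula is the standard textbook definition, not eXpress's internal estimator, which also resolves ambiguously mapped fragments.

```python
def passes_qc(raw_reads, mapped_reads, min_reads=1_000_000, min_mapped_frac=0.15):
    """Return True if a sample meets the retention criteria described above.

    raw_reads and mapped_reads are read counts for one sample (hypothetical inputs).
    """
    if raw_reads < min_reads:
        return False
    # "Over 15%" is read as a strict inequality (assumption).
    return mapped_reads / raw_reads > min_mapped_frac


def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    # fragments / (length / 1e3) / (total / 1e6) simplifies to the expression below.
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)
```

For instance, 100 fragments on a 2,000 bp transcript in a sample with 10 million mapped fragments gives an FPKM of 5.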
- To identify stress-specific differentially expressed transcripts in each dataset, we selected the subsets that included at least three samples, so that the significance test criteria were fulfilled, and then converted the expression FPKM value to a log 2.
- This resulted in 259 stress-specific subset pairs with differentially expressed transcripts (P-value <.
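A minimal sketch of the comparison between two subsets is given below. The text only states that FPKM values are log2-transformed and that a t-test and FDR are computed; the pseudocount, the choice of Welch's (unequal-variance) statistic, and the Benjamini-Hochberg procedure are assumptions made here for illustration, and all function names are hypothetical.

```python
import math
from statistics import mean, variance


def log2_fpkm(values, pseudocount=1.0):
    """Log2-transform FPKM values; the pseudocount avoids log2(0) (assumption)."""
    return [math.log2(v + pseudocount) for v in values]


def welch_t(a, b):
    """Welch's t statistic for two groups of log2 expression values.

    Each group needs at least two values, and the groups must not both
    have zero variance.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Turning the t statistic into a P-value requires the t distribution; in practice a library routine such as `scipy.stats.ttest_ind(a, b, equal_var=False)` returns both the statistic and the P-value.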
- Protein-coding RNA-lncRNA coexpression networks.
- To investigate the potential stress-specific biological functions of lncRNAs in plant cells, we constructed protein-coding RNA and lncRNA coexpression networks from Arabidopsis thaliana and Oryza sativa data.
- Cytoscape [36] was used to display the protein-coding RNA-lncRNA coexpression networks in the web interface of PSRN.
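One way to picture a coexpression network built from expression correlation is the sketch below. It assumes Pearson correlation and an absolute-correlation cutoff of 0.9; neither the correlation measure nor the cutoff is given in the text, and the function names and input layout (transcript ID mapped to a list of expression values over the same samples) are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant value lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_edges(lncrna_expr, coding_expr, min_abs_r=0.9):
    """Return (lncRNA, coding transcript, r) edges whose |r| meets the cutoff."""
    edges = []
    for lnc_id, lnc_values in lncrna_expr.items():
        for cod_id, cod_values in coding_expr.items():
            r = pearson(lnc_values, cod_values)
            if abs(r) >= min_abs_r:
                edges.append((lnc_id, cod_id, r))
    return edges
```

The resulting edge list is the kind of table a network viewer such as Cytoscape can load, with the correlation stored as an edge attribute.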
- The PSRN database includes 12 plant species: Arabidopsis thaliana, Chlamydomonas reinhardtii, Glycine max, Manihot esculenta, Oryza sativa indica, Oryza sativa japonica, Panicum virgatum, Populus tremuloides, Solanum lycopersicum, Sorghum bicolor, Triticum aestivum, and Vitis vinifera, which contain 26 RNA-Seq datasets and 937 samples.
- All samples were classified and assigned to 133 stress-specific subsets, which were constructed into 254 subset pairs to describe stress-specific differentially expressed (DE) transcripts from a systematic RNA-Seq analysis.
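The relationship between datasets, subsets, and subset pairs can be sketched as follows. This sketch enumerates every candidate pair of subsets within a dataset that has at least three samples per subset; PSRN uses selected pairs, so the real set is narrower than this enumeration. The function name and the nested-dictionary layout are hypothetical.

```python
from itertools import combinations


def candidate_subset_pairs(datasets, min_samples=3):
    """List (dataset, subset, subset) triples eligible for comparison.

    datasets maps a dataset ID to {subset name: list of sample IDs}.
    Pairs are only formed between subsets of the same dataset.
    """
    pairs = []
    for dataset_id, subsets in datasets.items():
        eligible = sorted(
            name for name, samples in subsets.items() if len(samples) >= min_samples
        )
        for first, second in combinations(eligible, 2):
            pairs.append((dataset_id, first, second))
    return pairs
```

A dataset with three eligible subsets yields three candidate pairs, while a subset with fewer than three samples contributes none.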
- Considering the variation that exists between different analyses, we compared expression profiles only within the same database.
- According to the information in the individual papers, we used the transcript ID or gene name to find the corresponding expression profile.
- PSRN provides a user-friendly web interface that integrates large-scale stress-specific RNA-Seq datasets of plants.
- Here, we describe the analysis unit, which is the main function unit of the PSRN database.
- As shown in Fig. 2a, the analysis function provides a tree structure in the species and subset panel, which facilitates searching and browsing stress-specific subsets.
- Fig. 1 The framework of the database construction in PSRN.
- The plant stress RNA-Seq datasets were collected from NCBI GEO and SRA, and then all samples were classified into stress-specific subsets for each dataset.
- In the RNA-Seq data processing, Bowtie2 and eXpress were used to calculate transcript expression for each RNA-Seq dataset with references collected from Phytozome, Ensembl Plants, and PopGenIE.
- Finally, we calculated the t-test and FDR on the log2 scale between two subsets belonging to the same dataset and then constructed the user interface for PSRN.
- When users select a subset of interest, the associated subset pairs are subsequently listed in the subset-pair panel.
- When users select a subset pair, the web server shows the detailed information of the DE transcripts of the subset pair, as well as the detailed description of the dataset and subsets, in the right main panel.
- In the right main panel, there are three subpanel tabs as follows:
- The interface displays the rank, the P-value, the FDR, the average expression values in the.
- Whenever a user clicks a transcript ID, PSRN shows details of its expression in all subsets across all datasets belonging to the same species.
- Additionally, the search function on the expression profiles allows users to investigate all transcripts associated with a given transcript ID, KEGG Orthology number/name, or RefSeq annotation, and an autocomplete function provides suggestions for the search field as the user types, quickly searching and displaying partially matched.
- Fig. 2 Screenshots of the web interface of PSRN.
- (3) Expression profile panel: subset-pair information and differentially expressed protein-coding transcripts are shown in the panel and sorted by significance level.
- When users use Search in the expression panel, they can enter a transcript ID, KEGG Orthology/name, or RefSeq ID into the autocomplete field, which lets them quickly search for and select partially matched terms.
- Finally, PSRN generates the expression profiles of all isoforms in the search results.
- If a user clicks “DE lncRNAs” for Arabidopsis thaliana or Oryza sativa, differentially expressed lncRNA transcripts are shown in place of the protein-coding isoforms.
- sativa presents the regulatory network according to the correlation of expression between lncRNAs and protein-coding transcripts.
- The search function also allows users to investigate the expression profiles of multiple genes at one time.
- All functions of this panel are similar to the DE coding transcript panel but visualize the expression profiles of DE lncRNAs sorted by the P-value.
- (iii) Protein-coding RNA-lncRNA coexpression network:
- Similar to the DE lncRNA panel, this panel is only constructed for Arabidopsis thaliana and Oryza sativa groups.
- (i) In the analysis results for the GSE54680 dataset, the top 5 significantly upregulated transcripts in the 10 °C subset are AT3G50970.1 (Low Temperature-Induced 30, LTI30, XERO2), AT4G14690.1 (Early Light-Inducible Protein 2, ELIP2), AT1G09350.1 (Galactinol Synthase 3, GOLS3, ATGOLS3), AT5G52310.1 (Low-Temperature-Induced 78, LTI78, COR78), and AT1G20440.1 (Cold-Regulated 47, COR47, ATCOR47) (illustrated in Fig.
- GOLS catalyzes the first step in the biosynthesis of RFO.
- Fig. 3 The expression of cold-related transcripts in Arabidopsis thaliana treated with different temperature conditions.
- The top 5 significantly upregulated transcripts in the 10 °C subset are AT3G50970.1, AT4G14690.1, AT1G09350.1, AT5G52310.1, and AT1G20440.1 (P-value = 4.41E-78 to 3.09E-57).
- According to the TAIR database, AT3G50970.1, AT5G52310.1, and AT1G20440.1 are Low Temperature-Induced 30 (LTI30, XERO2), Low-Temperature-Induced 78 (LTI78, COR78), and Cold-Regulated 47 (COR47, ATCOR47), respectively.
- Limited by the layout width, only parts of the expression profiles are shown.
- Fig. 4 The expression of HAB1 isoforms in Arabidopsis thaliana seedlings treated with ABA.
- The group A PP2C (protein phosphatases 2C) HAB1 gene has three transcript isoforms: HAB1.1 (AT1G72770.1), HAB1.2 (AT1G72770.2), and HAB1.3 (AT1G72770.3).
- Based on the P-value calculated by PSRN, the HAB1 isoform AT1G72770.1 is significantly downregulated (P-value = 9.33E-03) in rbm25–1 mutant seedlings compared with the wild type, while AT1G72770.2 is significantly upregulated (P-value = 2.01E-03) in rbm25–1 mutant seedlings.
- used the HAB1 gene, a group A protein phosphatase 2C (PP2C), to demonstrate the biological importance of the level of expression of isoforms provided in PSRN (illustrated in Fig.
- In Arabidopsis, the HAB1 gene has three alternatively spliced isoforms: HAB1.1 (AT1G72770.1), HAB1.2 (AT1G72770.2), and HAB1.3 (AT1G72770.3).
- HAB1.1 interacts with the Open Stomata 1 (OST1), inhibiting its kinase activity, which switches the ABA signaling off.
- In contrast, HAB1.2 encodes a nonfunctional truncated protein, thereby keeping the ABA signaling on.
- Thus, accurate regulation of the HAB1.1 to HAB1.2 ratio is important for the fine-tuning of ABA signaling and plant adaptation to stress.
- A loss-of-function mutation in RBM25, rbm25–1, resulted in an increase in the HAB1.2:HAB1.1 ratio and ABA-hypersensitive phenotypes [44].
- As shown in Fig. 4, HAB1.1 (AT1G72770.1) is significantly downregulated while HAB1.2 (AT1G72770.2) is significantly upregulated in rbm25–1 mutant seedlings, which is consistent with previous reports.
- Despite numerous RNA-Seq datasets being collected at the beginning of the study, only a minority of them were retained for PSRN construction.
- Most of the discarded datasets were excluded because they had too few samples to meet the criteria for subset creation, and the rest were excluded because they lacked a reference sequence.
- To solve the latter problem, we will continue to collect references, annotations, and RNA-Seq datasets to expand the PSRN database and keep it up to date.
- The next updated version will include: (1) new plant stress RNA-Seq datasets and reference transcriptomes, with data curation and KEGG and RefSeq annotations, and (2) a rerun of the expression profile analysis pipeline on the RNA-Seq datasets.
- DESeq will be run in this update, and the results of both the t-test and DESeq will be reported in the database.
- RNA-Seq: RNA sequencing.
- MOST B-005-005 and MOST B-005-035-MY2), and in part by the Advanced Plant Biotechnology Center from The Featured Areas Research Center Program within the framework of the Higher Education Sprout Project by the Ministry of Education (MOE) in Taiwan.
- JRL also collected RNA-Seq datasets and reference transcriptomes and created the work-flow.
- CHS performed bioinformatics pipelines and contributed to the database and website construction.
- from genes to the field.
- Differential SAGE analysis in Arabidopsis uncovers increased transcriptome complexity in response to low temperature.
- RNA-Seq: a revolutionary tool for transcriptomics.
- Characterization of stress-responsive lncRNAs in Arabidopsis thaliana by integrating expression, epigenetic and structural features.
- STIFDB2: an updated version of plant stress-responsive transcription factor database with additional stress signals, stress-responsive transcription factor binding sites and stress-responsive genes in Arabidopsis and rice.
- PSPDB: plant stress protein database.
- Ultrafast and memory-efficient alignment of short DNA sequences to the human genome.
- Single-cell RNA-seq reveals dynamic paracrine control of cellular variation.
- Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation.
- Characterization and differential expression of Dhn/lea/Rab-like genes during cold-acclimation and drought stress in Arabidopsis thaliana.
- Overexpression of multiple dehydrin genes enhances tolerance to freezing stress in Arabidopsis.
- Important roles of drought- and cold-inducible genes for galactinol synthase in stress tolerance in Arabidopsis thaliana.
- Differential expression of two related, low-temperature-induced genes in Arabidopsis thaliana (L.) Heynh.
