
Genome-wide analyses supported by RNA-Seq reveal non-canonical splice sites in plant genomes


Summary

- Strong conservation across multiple species and non-random accumulation of substitutions in splice sites indicate a functional relevance of non-canonical splice sites.
- Non-canonical splice sites were first identified before genome sequences became available on a massive scale (reviewed in [29.
- GC-AG and AT-AC are classified as major non-canonical splice site combinations, while all deviations from these sequences are deemed to be minor non-canonical splice sites.
- Dedicated split-read aligners like STAR [31, 32] are able to detect non-canonical splice sites during the alignment of RNA-Seq reads to genomic sequences.
- Nonetheless, the combined number of currently inferred minor non-canonical splice site combinations is even higher than the number of the major non-canonical AT-AC splice site combinations [30, 34].
- We incorporated RNA-Seq data to differentiate between artifacts and bona fide cases of active non-canonical splice sites.
- We then identified homologous non-canonical splice sites across species and subjected the genes containing these splice sites to phylogenetic analyses.
- Classification of annotated splice sites.
- A more detailed classification into major non-canonical splice site combinations (GC-AG, AT-AC) and all remaining minor non-canonical splice site combinations was applied.
- As proof of concept, At1g79350 (rna15125), a gene containing a previously validated non-canonical splice site [30], was investigated in more depth.
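
The classification described above (canonical GT-AG, major non-canonical GC-AG and AT-AC, every other combination minor non-canonical) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' script: the function name `classify_splice_site` and the example intron sequences are hypothetical, and the sketch assumes each intron is supplied as a sense-strand sequence running from the donor site to the acceptor site.

```python
from collections import Counter

# Terminal dinucleotides (donor, acceptor) as named in the summary.
CANONICAL = ("GT", "AG")
MAJOR_NON_CANONICAL = {("GC", "AG"), ("AT", "AC")}


def classify_splice_site(intron_seq: str) -> str:
    """Classify an intron by its first and last two nucleotides."""
    seq = intron_seq.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to carry both splice sites")
    combination = (seq[:2], seq[-2:])
    if combination == CANONICAL:
        return "canonical"
    if combination in MAJOR_NON_CANONICAL:
        return "major non-canonical"
    return "minor non-canonical"


# Hypothetical intron sequences, for illustration only.
introns = ["GTAAGTTTTCAG", "GCAAGTTTTCAG", "ATATCCTTTTAC", "CAAAGTTTTCGG"]
counts = Counter(classify_splice_site(s) for s in introns)
for label, n in sorted(counts.items()):
    print(label, n)
```

Keeping the two major combinations in a set makes it easy to tally each class per species with a `Counter`, which mirrors the per-species counts the summary refers to.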
- Validation of annotated splice sites.
- Comparison of non-canonical splice sites to overall sequence variation.
- rates in a species were compared against the observed substitution in minor non-canonical splice sites via a chi-squared test.
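
A comparison of this kind can be sketched as a chi-squared goodness-of-fit test with one degree of freedom. The summary does not state how the authors arranged their counts, so the layout below is an assumption: the number of substituted splice site positions is compared against the number expected from a background substitution rate. The function name `chi_square_gof` and all counts are hypothetical.

```python
import math


def chi_square_gof(observed_hits: int, total: int, expected_rate: float):
    """Chi-squared goodness-of-fit test with one degree of freedom.

    Compares observed hits/non-hits among `total` positions against the
    counts expected under `expected_rate`. Returns (statistic, p_value).
    """
    if not 0.0 < expected_rate < 1.0:
        raise ValueError("expected_rate must lie strictly between 0 and 1")
    if not 0 <= observed_hits <= total:
        raise ValueError("observed_hits must lie between 0 and total")
    observed = [observed_hits, total - observed_hits]
    expected = [total * expected_rate, total * (1.0 - expected_rate)]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For one degree of freedom the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical numbers: 120 substituted positions among 2000 splice site
# positions, against an assumed background rate of 4 %.
stat, p = chi_square_gof(observed_hits=120, total=2000, expected_rate=0.04)
print(f"chi2 = {stat:.2f}, p = {p:.3g}")
```

The closed-form p-value only holds for one degree of freedom; a larger contingency table would need a general chi-squared distribution from a statistics library.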
- Genomic properties of plants and diversity of non-canonical splice sites.
- Our investigation of these 121 plant genome sequences revealed a huge variety of different non-canonical splice site combinations (Additional files 6 and 7).
- Camelina sativa displayed the highest number of minor non-canonical splice.
- There is a strong correlation between the number of non-canonical splice site combinations and the total number of splice sites (Spearman correlation coefficient = 0.53, p-value .
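
For readers unfamiliar with the statistic, a Spearman correlation coefficient is the Pearson correlation of the ranks of two variables. The sketch below is a plain-Python illustration of that definition, with ties given their average rank. The helper names and the per-species counts are hypothetical and are not taken from the paper's data.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0.0 or sy == 0.0:
        raise ValueError("correlation undefined for constant input")
    return cov / (sx * sy)


# Hypothetical per-species counts, for illustration only.
total_splice_sites = [120000, 95000, 210000, 150000, 80000]
non_canonical_combinations = [310, 280, 450, 300, 150]
print(round(spearman(total_splice_sites, non_canonical_combinations), 2))
```

Because only ranks enter the calculation, the coefficient is insensitive to the large differences in absolute splice site numbers between species.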
- Non-canonical splice sites are likely to be similar to canonical splice sites.
- There is a negative correlation between the frequency of non-canonical splice site combinations and their divergence from canonical sequences (r.
- Splice sites with one difference to a canonical splice site are more frequent than more diverged splice sites.
- A similar trend can be observed around the major non-canonical splice sites AT-AC (Fig.
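
One simple way to quantify how far a splice site combination diverges from a reference is the number of mismatching nucleotides across the four terminal positions (a Hamming distance). The summary does not say which measure the authors used, so this is an assumption, and the function name `divergence` is hypothetical.

```python
def divergence(donor: str, acceptor: str, reference=("GT", "AG")) -> int:
    """Count mismatches between a donor-acceptor pair and a reference pair."""
    combination = (donor + acceptor).upper()
    ref = "".join(reference).upper()
    if len(combination) != len(ref):
        raise ValueError("donor and acceptor must be dinucleotides")
    return sum(a != b for a, b in zip(combination, ref))


# Distance to the canonical GT-AG combination.
print(divergence("GC", "AG"))  # 1
print(divergence("CA", "GG"))  # 3

# Distance to the major non-canonical AT-AC combination instead.
print(divergence("AT", "AG", reference=("AT", "AC")))  # 1
```

Grouping observed combinations by this distance would show whether combinations one mismatch away from a reference are indeed the most frequent.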
- vinifera (Additional files 10, 11 and 12), there were slightly fewer genes with non-canonical splice sites close to the centromeres.
- RNA-Seq reads supported 224 of these CA-GG splice sites.
- Non-canonical splice sites in single copy genes.
- The average percentage of genes with non-canonical splice sites among single copy BUSCO genes was 11.4%.
- splice sites among BUSCO genes (Additional file 14).
- A couple of species displayed an inverted situation, having fewer genes with non-canonical splice sites among the BUSCO genes than the genome-wide average.
- Length distributions of introns with canonical and non-canonical splice site combinations are similar in most regions (Fig.
- These distributions indicate that non-canonical splice sites are more frequent in introns that deviate from the average length.
- Stress-related genes were checked for increased intron sizes, because non-canonical splice site combinations might be associated with stress-response.
- The likelihood of having a non-canonical splice site in a gene is almost perfectly correlated with the number of introns (Additional file 15).
- Conservation of non-canonical splice sites.
- Non-canonical splice site combinations detected in A.
- Of 1296 non-canonical splice site combinations, 109 overlapped with listed variant positions.
- To differentiate between randomly occurring non-canonical splice sites (e.g. sequencing errors) and true biological variation, the conservation of non-canonical splice sites across multiple species can be analyzed.
- Manual inspection revealed that non-canonical splice sites were conserved in three positions in many putative homologous genes across various species (Additional file 16).
- Medicago truncatula, Oryza sativa, Populus trichocarpa, Monoraphidium neglectum, and Morus notabilis displayed substantially lower validation values for the major non-canonical splice sites.
- The same trend holds true for major non-canonical GC-AG splice site.
- The most striking differences are (1) at the intron length peak around 200 bp, where non-canonical splice site combinations are less likely, and (2) at very long intron lengths, where introns with non-canonical splice sites are more likely.
- Major non-canonical AT-AC and minor non-canonical splice sites did not show a difference between 5′ and 3′.
- minor non-canonical splice site combinations to 0.82 in major non-canonical AT-AC splice site combinations.
- In order to provide an example for the usage of minor non-canonical splice sites under stress conditions, four single RNA-Seq data sets of B.
- The number of RNA-Seq supported minor non-canonical splice site combinations increased between control and stress conditions from 17.
- Occurrences of the canonical GT-AG, the major non-canonical GC-AG and AT-AC as well as the combined occurrences of all minor non-canonical splice sites (others) are displayed.
- Fig. 5: Usage of splice sites.
- Canonical GT-AG splice site combinations are used more often than major or minor non-canonical splice site combinations.
- Our results update and expand previous systematic analyses of non-canonical splice sites in smaller data sets.
- Our analyses supported a variety of different non-canonical splice sites matching previous reports of bona fide non-canonical splice sites.
- Frequencies of different minor non-canonical splice site combinations are not random and vary between different combinations.
- Those combinations similar to the canonical combination or the major non-canonical splice site combinations are more frequent.
- GT-AG canonical splice sites is in agreement with recent reports for A.
- findings together, both major and minor non-canonical splice sites could be a more significant phenomenon of splicing in plants than in animals.
- An in-depth investigation of non-canonical splice sites in animals and fungi would be needed to validate this hypothesis.
- Species-specific differences in minor non-canonical splice site combinations.
- Nevertheless, conserved non-canonical splice site positions exist as presented on the gene level for At1g79350.
- The group of minor non-canonical splice sites displayed the largest variation between species, and a frequent non-canonical splice site combination (CA-GG) which appeared peculiar to O.
- thaliana support this conjecture and suggest that some non-canonical splice sites are conserved in homologous loci at the intra-specific level.
- Putative mechanisms for processing of minor non-canonical splice sites.
- We sought to understand possible correlations with minor non-canonical splice site combinations in order to understand the mechanisms driving their occurrence.
- Further investigation might connect neighbouring sequences to the processing of minor non-canonical splice sites.
- Usage of non-canonical splice sites.
- As previously indicated by several reports, non-canonical splice sites might be more frequently used under stress conditions.
- Splice sites of interest might be canonical splice site combinations in some accessions or subspecies, respectively, while they are non-canonical in others.
- Therefore, we cannot exclude that certain non-canonical splice sites were missed in our splice site usage analysis due to a lack of gene expression under the investigated conditions.
- Investigation of homologous non-canonical splice sites poses several difficulties, as the exonic sequence is not necessarily conserved.
- However, a computationally feasible approach to investigate the phylogeny of all non-canonical splice sites would significantly enhance our knowledge, e.g. about the emergence and loss of non-canonical splice sites.
- Splice sites could be experimentally validated e.g.
- Non-canonical splice site combinations are present and appear to be functionally relevant in most plants, although at low abundance.
- Additional file 6: Number of splice sites per species.
- Canonical and non-canonical splice sites were counted per species as described in the method section.
- Additional file 8: Similarity of the non-canonical splice site pattern across plants.
- For each investigated species the number of canonical and non-canonical splice sites is displayed.
- The Spearman correlation coefficient between splice site number and genome size is r = 0.14 for canonical splice sites and r = 0.02 for non-canonical splice sites.
- Additional file 10: Genome-wide distribution of non-canonical splice sites in A.
- The distribution of genes with non-canonical splice sites (red dots) across the five chromosome sequences (black lines) of A.
- Additional file 11: Genome-wide distribution of non-canonical splice sites in B.
- The distribution of genes with non-canonical splice sites (red dots) across the nine chromosome sequences (black lines) of B.
- Additional file 12: Genome-wide distribution of non-canonical splice sites in V.
- The distribution of genes with non-canonical splice sites (red dots) across the 19 chromosome sequences (black lines) of V.
- Additional file 13: Conserved sequences around splice sites in Oryza sativa.
- Additional file 14: Non-canonical splice sites in single copy genes.
- The occurrence of non-canonical splice sites in single copy genes (BUSCO) and in all genes was assessed per species.
- Additional file 15: Proportion of non-canonical splice sites.
- The green line indicates the average (median) proportion of genes with a non-canonical splice site combination.
- Genes with more introns are more likely to have a non-canonical splice site combination.
- Additional file 16: Conservation of non-canonical splice sites.
- Non-canonical splice sites at conserved positions in putative homologs of At1g79350 across various species.
- Additional file 17: Supported splice sites.
- Percentage of splice sites supported by RNA-Seq reads is given per species.
- Lessons from non-canonical splicing.
- A reappraisal of non-consensus mRNA splice sites.
- Consideration of non-canonical splice sites improves gene prediction on the Arabidopsis thaliana Niederzenz-1 genome sequence.
- Analysis of canonical and non-canonical splice sites in mammalian genomes.
- RNA-Seq read coverage depth of splice sites in plants.
- serine (SR) proteins in maize are differentially spliced and utilize non-canonical splice sites.
- A comprehensive survey of non-canonical splice sites in the human transcriptome
