
Pan-tissue transcriptome analysis of long noncoding RNAs in the American beaver Castor canadensis


Summary

- The genome of the American beaver (Castor canadensis) has recently been sequenced, setting the stage for the systematic identification of beaver lncRNAs and the characterization of their expression in various tissues.
- Of the high-confidence lncRNA contigs, 147 have no known orthologs (and thus are putative novel lncRNAs) and 40 have mammalian orthologs.
- While the novel lncRNAs were on average shorter than their annotated counterparts, they were similar to the annotated lncRNAs in terms of the relationships between contig length and minimum free energy (MFE) and between coverage and contig length.
- We profiled the expression of the 187 high-confidence lncRNAs across 16 beaver tissues (whole blood, brain, lung, liver, heart, stomach, intestine, skeletal muscle, kidney, spleen, ovary, placenta, castor gland, tail, toe-webbing, and tongue) and identified both tissue-specific and ubiquitous lncRNAs.
- LncRNAs — both novel and those with known orthologs — are expressed in each of the beaver tissues that we analyzed.
- 2020 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0).
- Mounting evidence implicating species-specific ncRNAs and gene regulatory mechanisms in species adaptations [3, 5], including various species-specific responses to hypoxia [3, 4], suggests that species-specific and taxon-specific lncRNAs may underlie some of the adaptations seen in mammalian evolution.
- The genome and three tissue transcriptomes of the American beaver Castor canadensis (Order Rodentia, Family Castoridae) have recently been sequenced [7, 8], enabling the systematic search for molecular determinants of this semi-aquatic herbivore’s unique physiologic, anatomic, and behavioral adaptations.
- and (3) measuring expression levels of the lncRNA contigs in the 16-tissue atlas.
- From the measured expression levels of the 187 lncRNAs across the 16 tissues, we (i) identified both tissue-specific and tissue-ubiquitous lncRNAs, (ii) correlated tissue expression profiles of three beaver lncRNAs with the tissue expression profiles of their orthologs, and (iii) identified biological pathways and biological processes that beaver lncRNAs may regulate.
- To obtain a comprehensive profile of the noncoding transcriptome of the American beaver, we paired-end sequenced polyadenylated RNA pooled from samples of sixteen different beaver tissues and de novo assembled a “pan-tissue” beaver polyadenylated RNA transcriptome using Trinity (see Methods).
- We merged the transcript contigs into 86,714 non-redundant contigs, which became the basis for the remainder of the lncRNA screen.
- As a test of the completeness of the pan-tissue beaver polyadenylated RNA transcriptome, we used a benchmark set of 4014 genes (the mammalian Benchmarking Universal Single-Copy Ortholog [BUSCO].
- We found that 66% of the mammalian BUSCO genes had high-confidence (E <.
- (2) filtering based on contigs’ coding potential score (p ≤ 0.01) as predicted based on their hexamer sequence content and the length of and coverage of the transcript by the longest Open Reading Frame (ORF).
- Kolmogorov-Smirnov test) longer than those of the novel lncRNAs (Fig.
- Columns as follows: “Step”, the name of the program or step in the screening pipeline.
- Contigs Eliminated”, the percentage of contigs from Column 4 of the previous row in the table that were eliminated in this step of the analysis pipeline.
- Contigs Eliminated”, the number of contigs corresponding to the.
- identity, and over 91% of putative novel lncRNA contigs had an alignment equivalent to at least 70% of the contig’s length (Additional file 1 Figure.
- Of the 144 aligned contigs, all had greater than 90% of their sequence mapped and 140 had greater than 95% of their sequence mapped.
- Novel lncRNAs in the American beaver.
- This concordance between length and MFE is not surprising in light of the inverse relationship between transcript length and secondary structural stability (Fig.
- All eight novel contigs had robust expression.
- Interestingly, none of the eight lncRNAs were among those contigs with the highest coverage.
- This may be explained by the weakness of the relationship between length and observed coverage of novel lncRNA transcripts (Fig.
- Furthermore, among the novel transcripts, the four contigs with exceptionally high coverage had coverage that was, on average, 15-fold greater than that of the rest of the contigs.
- Of the 40 lncRNA contigs for which a high-confidence ortholog gene could be identified, the ortholog annotations included 16 long noncoding RNA genes, 12 noncoding antisense RNAs, ten noncoding isoforms of protein-coding genes, and two sense-overlapping RNAs (Table 3).
- To assess the possible functional coherence of the beaver lncRNAs with known orthologs, we analyzed KEGG biological pathway annotations for the human orthologs of the Table 3 (ortholog-mapped) lncRNAs for statistical enrichment (see Methods).
- Following the lncRNA discovery phase of the analysis, we used RNA-seq to analyze lncRNA levels in the 16 beaver tissues or anatomic structures (the same set of tissues from which we constructed the pooled transcriptome library): whole blood, brain, lung, liver, heart, stomach, intestine, skeletal muscle, kidney, spleen, ovaries, placenta, castor gland, tail skin, toe-webbing, and tongue.
- For each of the 187 contigs and in each of the 16 tissues, we estimated the transcript abundance in RPKM (see Additional file 6 Table S2 and Methods).
- Heatmap visualization of the tissue-specific expression profiles of the 147 novel (Fig.
- Fig. 4 Tissue-specific expression of novel lncRNAs in the American beaver.
- As an independent check on the biological validity of the RNA-seq-based lncRNA gene expression measurements, we compared the log2 expression in muscle of all 187 known and novel lncRNAs as measured in our study and by the Lok et al.
- We were able to map eight of the genes to mammalian orthologs (ERGIC2, RAD23, TP53RK, SCRN3, RAD21, RAD5, SECISBP2, PPARD) (see Methods).
- The functional annotations of the eight ortholog genes are enriched for the Gene Ontology biological process DNA Recombination, suggesting that the lncRNA contig81051.1 may be involved in regulating chromatin maintenance.
- Second, contigs contig6442.1 and contig11359.1, which are orthologs of the mammalian lncRNA MEG3, are strongly expressed in placenta, spleen, brain, ovary, tongue, lung, and heart.
- Finally, we note that four beaver lncRNAs (contig81530.1, contig29471.1, contig79757.1, and contig27553.1) cluster together in terms of gene expression, and all are orthologous to noncoding isoforms of the human gene potassium voltage-gated channel, shaker-related subfamily, member 3 (KCNA3).
- “annotation”, classification of the lncRNA transcript type if it is not an obligate lncRNA gene or if it is antisense to a protein-coding gene (i.
- the name of the transcript contig.
- “Species”, the species in which orthologs of the contig were detected by sequence similarity.
- Ensembl Gene ID, the Ensembl gene identifier of the putative human ortholog.
- “BLASTn annotation”, the annotation of the BLASTn hit corresponding to the statistics in the last three columns (E, %ID, nt).
- For the lncRNA contigs with known orthologs that are expressed in all of the beaver tissues, in general their human orthologs are ubiquitously expressed.
- The minimum-free energy secondary structure of the putative beaver MEG3 lncRNA (Fig.
- The analysis revealed no evidence of the existence of a murine ortholog of Ccan_.
- Although this work focused on discovering beaver lncRNAs using multi-tissue transcriptome profiling, some novel aspects of the bioinformatics workflow that we used are worth noting.
- Fig. 7 Predicted minimum-free energy secondary structures of the putative beaver MEG3 lncRNA Ccan_OSU1_lncRNA_contig11359.1 (a) and the homologous sequence of human MEG3 (b).
- Given the likelihood that many if not most of the novel contigs are partial transcripts, it seems plausible that this difference in lengths reflects the fact that a longer contig is less likely to miss the phylogenetically conserved portion of the gene.
- Fig. 8 Predicted minimum-free energy secondary structure of the novel spleen- and ovary-specific lncRNA Ccan_OSU1_lncRNA_contig44966.1, showing relatively high pairing probabilities.
- length that was in excess of 90% of the contig’s length.
- Furthermore, the mapping serves as a preliminary step in examining the genomic context of the putative lncRNA gene.
- Finally, the pathway enrichment analysis of human orthologs of the 40 ortholog-mappable lncRNA contigs (which are biased toward high expression in at least one tissue type) identified several pathways, including “ribosome”,.
- A signature adaptation of the beaver is its ability to withstand hypoxia, the response to which in mammals is known to reprogram intracellular calcium signaling [49], downregulate protein synthesis [50], and activate neuroendocrine [51] pathways.
- One caveat of this analysis is that, in light of a recent report that some lncRNAs may encode micropeptides [52, 53], the stringent cutoff used to filter for coding potential of the lncRNA contigs likely eliminated some lncRNA contigs.
- For several of the 40 lncRNA contigs with known ortholog genes (e.g., MEG3, RP11-415F23.2, AC079135.1, KCNA3), we found consistent patterns of tissue-specific expression between the beaver transcript contigs and the ortholog genes, bolstering evidence for the ortholog mappings and confirming previous reports that tissue-specific expression of noncoding RNAs is often phylogenetically conserved across ortholog pairs [54].
- For MEG3, the consistency of predicted secondary structure of the beaver lncRNA contig and the.
- More broadly, the overall pattern of tissue-specific expression of the known lncRNA contigs in beaver grouped related tissues (e.g., skeletal muscle, heart, and tongue in one subgroup, and kidney and stomach in another subgroup), consistent with previously published results for mouse [34].
- We annotated the 40 known lncRNAs based on their orthologs and confirmed consistency of tissue expression (between beaver and the orthologous species) for several of the lncRNAs for which ortholog tissue expression data could be obtained.
- Eight of the novel lncRNA contigs have especially strong evidence across five different heuristics for biological significance and may be the most promising contigs to use as a basis for hypothesis generation for targeted functional investigations.
- To the best of our knowledge, this work is the first comprehensive tissue transcriptome analysis of the beaver.
- see Availability of data and materials) will provide a foundation for improving annotation of the beaver genome, characterizing tissue expression of all beaver genes, extending rodent comparative genomics, and elucidating the biological mechanisms underlying the beaver’s unique adaptations.
- From each of the 16 homogenized tissue samples, we isolated total RNA using the Zymo Direct-zol RNA MiniPrep (Zymo Research) kit.
- Tissue transcriptome atlas: For each of the sixteen tissues, we prepared barcoded cDNA libraries for paired-end Illumina sequencing in triplicate using the TruSeq Stranded mRNA Library Prep Kit (Illumina).
- We sequenced the sixteen tissue samples for 2 × 150 cycles on one lane of the HiSeq 3000 (Illumina), obtaining an average of 21.4 million read pairs per sample (across-samples standard deviation of 3.0 million read pairs).
- Starting with the paired FASTQ files from the MiSeq sequencing of the pooled tissue RNA libraries, we bioinformatically trimmed overrepresented polyadenine and adapter sequences using fastq_clipper v534 (github.com/.
- This step also had the effect of reducing computational complexity for the remainder of the pipeline.
- To estimate the transcriptome coverage of highly conserved mammalian genes across the sixteen tissues, we used the BUSCO software v2.0 [24] on six pan-tissue transcriptome assemblies: (i) the de novo Trinity assembly, before modification by transfuse, (ii) a transcript file generated using Maker Gene Models [60] analysis of the reference genome.
- For each of the 86,714 contigs, we searched for orthologs using BLASTn [67] against the NCBI Nucleotide Database [68], with an E-value threshold of 10⁻³.
- In order to eliminate contigs that are likely untranslated region (UTR) portions of protein-coding transcripts, we aligned the remaining 182 high-confidence noncoding, “no orthologs” contigs to scaffold sequences of the Oregon State University draft beaver genome assembly using BLASTn.
- Fig. 9 Overview of the computational pipeline for identifying beaver lncRNAs.
- We ignored a match if any of the following phrases (or their abbreviations) appeared in the subject sequence title: predicted, synthetic construct, bacterial artificial chromosome, P1-derived artificial chromosome, predicted gene, transgenic, mutant allele, clone, cloning vector, hypothetical, complete genome.
- (ii) possible lncRNA if both lncRNA and protein-coding mRNA BLASTn matches were approximately equally abundant and of approximately equal quality as measured by length and percent identity of the BLASTn hit.
- We annotated any contigs that did not fall into the above classification categories based on manual inspection of the Ensembl gene model in the context of the contig’s BLAST-Like Alignment Tool (BLAT) match to the human (GRCh38) or mouse (GRCm38) genome assemblies.
- We computed the average coverage of the contigs by the RNA-seq reads using samtools v1.9.
- For each tissue sample, we normalized read counts by the total number of reads in the sample and computed the log2 of the zero-inflated normalized counts.
- We used NCBI BLASTn for ortholog mapping and Enrichr for the functional enrichment analysis of the orthologs of co-expressed genes.
- Manual curation of the 40 known lncRNAs.
- Category, the classification of the contig.
- Species, the species of the subject from the previous column.
- Description, the BLASTn descriptor of the subject sequence.
- type, the classification of the contig as “known” or “novel”.
- remaining columns are of the format.
- where TissueType is one of the 16 tissues collected and profiled (see Methods Section “Sample Collection.
- We thank the staff of the Oregon State University Center for Genome Research and Biocomputing, Jessica Nixon, Aaron Trippe, Mark Dasenko, Matthew Peterson, and Chris Sullivan, for technical assistance.
- This work was carried out with support from the SeqTheBeav project (beavergenome.org), a collaborative effort that crowd-funded the sequencing of the genome of the American beaver, the mascot of Oregon State University.
- The authors thank all of the SeqTheBeav project volunteers and donors who made the project possible.
- A special note of thanks to Jeannine Cropley (CGRB) and Keaton Kirkpatrick (OSU Foundation) for their support of the project.
- The funding bodies played no role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.
- Repression of the long noncoding RNA-LET by histone deacetylase 3 contributes to hypoxia-mediated metastasis.
- Genome and Transcriptome Assembly of the Canadian Beaver.
- Fiber digestion in the beaver.
- Substrate-driven convergence of the microbial Community in.
- Hibernation induces widespread transcriptional remodeling in metabolic tissues of the grizzly bear.
- Regulation of the cohesin-loading factor NIPBL: role of the lncRNA NIPBL-AS1 and identification of a distal enhancer element.
- Comparison of the transcriptional landscapes between human and mouse tissues.
- Acute hypoxia activates neuroendocrine, but not presympathetic, neurons in the paraventricular nucleus of the hypothalamus: differential role of nitric oxide.
- Database resources of the National Center for Biotechnology Information.
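
The summary reports transcript abundance in RPKM and describes normalizing read counts by each sample's total reads before taking log2. As a rough illustration only, the sketch below shows the standard RPKM formula and one possible reading of the log2 step. The function names, the per-million scaling, and the pseudocount of 1 (used so that zero counts stay finite) are assumptions for this example and are not taken from the article.

```python
import math


def rpkm(read_count, transcript_length_nt, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    Standard definition: reads / (length in kb * mapped reads in millions).
    """
    if transcript_length_nt <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_nt * total_mapped_reads)


def log2_normalized(read_count, total_reads, scale=1e6, pseudocount=1.0):
    """log2 of library-size-normalized counts.

    ASSUMPTION: counts are scaled to reads-per-million and a pseudocount
    is added before the log; the article does not specify either detail.
    """
    if total_reads <= 0:
        raise ValueError("total reads must be positive")
    return math.log2(read_count / total_reads * scale + pseudocount)


if __name__ == "__main__":
    # Hypothetical numbers: 500 reads on a 2,000-nt contig in a
    # library of 20 million mapped reads.
    print(rpkm(500, 2000, 20_000_000))
    print(log2_normalized(500, 20_000_000))
```

With a pseudocount of 1, a contig with zero reads maps to a log2 value of 0, which keeps unexpressed contigs on the same scale as expressed ones in a heatmap.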

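The ortholog search described above combines an E-value threshold of 10⁻³ with a list of subject-title phrases that cause a BLASTn match to be ignored. A minimal sketch of such a filter follows. The function name `keep_hit`, the inclusive comparison against the threshold, and the case-insensitive substring matching are assumptions made for illustration; the abbreviations that the summary says were also excluded are not listed in the text, so they are not handled here.

```python
# Phrases copied from the summary; a match is ignored if its subject
# title contains any of them.
EXCLUDED_PHRASES = (
    "predicted",
    "synthetic construct",
    "bacterial artificial chromosome",
    "p1-derived artificial chromosome",
    "predicted gene",
    "transgenic",
    "mutant allele",
    "clone",
    "cloning vector",
    "hypothetical",
    "complete genome",
)

E_VALUE_THRESHOLD = 1e-3  # from the summary; inclusive comparison is an assumption


def keep_hit(subject_title, e_value):
    """Return True if a BLASTn hit passes both filters.

    ASSUMPTION: phrases are matched as case-insensitive substrings, so
    "clone" also matches "cloned"; the article's exact rule is not given.
    """
    if e_value > E_VALUE_THRESHOLD:
        return False
    lowered = subject_title.lower()
    return not any(phrase in lowered for phrase in EXCLUDED_PHRASES)


if __name__ == "__main__":
    # Hypothetical hits as (subject title, E-value) pairs.
    hits = [
        ("Homo sapiens maternally expressed 3 (MEG3), long non-coding RNA", 1e-20),
        ("PREDICTED: Mus musculus uncharacterized LOC, ncRNA", 1e-20),
        ("Homo sapiens MEG3", 0.5),
    ]
    for title, e_value in hits:
        print(keep_hit(title, e_value), title)
```

Substring matching is deliberately broad; a stricter variant could match whole words only, at the cost of missing inflected forms.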