
Millefy: Visualizing cell-to-cell heterogeneity in read coverage of single-cell RNA sequencing datasets



- Background: Read coverage of RNA sequencing data reflects gene expression and RNA processing events.
- Single-cell RNA sequencing (scRNA-seq) methods, particularly “full-length” ones, provide read coverage of many individual cells and have the potential to reveal cellular heterogeneity in RNA transcription and processing.
- However, visualization tools suited to highlighting cell-to-cell heterogeneity in read coverage are still lacking.
- Results: Here, we have developed Millefy, a tool for visualizing read coverage of scRNA-seq data in genomic contexts.
- Millefy is designed to show read coverage of all individual cells at once in genomic contexts and to highlight cell-to-cell heterogeneity in read coverage.
- By visualizing read coverage of all cells as a heat map and dynamically reordering cells based on diffusion maps, Millefy facilitates discovery of “local” region-specific, cell-to-cell heterogeneity in read coverage.
- We applied Millefy to scRNA-seq data sets of mouse embryonic stem cells and triple-negative breast cancers and showed variability of transcribed regions including antisense RNAs, 3′ UTR lengths, and enhancer RNA transcription.
- Conclusions: Millefy simplifies the examination of cellular heterogeneity in RNA transcription and processing events using scRNA-seq data.
- Keywords: Single-cell RNA sequencing, Visualization, Read coverage.
- In scRNA-seq data analyses, visualization is crucial for quality control (QC) as well as exploratory data analyses.
- To date, many tools for visualizing gene expression matrices of scRNA-seq data have been proposed [3].
- Moreover, visual inspection of read coverage enables quality assessment of experimental methods (e.g., whether amplification is biased [7]) and bioinformatic methods (e.g., the accuracy of expression level estimation).
- Given that scRNA-seq has revealed cellular heterogeneity in gene [8] and splicing isoform expression [9, 10], visualization of read coverage of scRNA-seq data is expected to reveal cellular heterogeneity in read coverage, which can be interpreted as biological (e.g., transcription and RNA processing) and technical (e.g., amplification biases) heterogeneity.
- Read coverage is informative, especially for so-called “full-length” scRNA-seq methods such as Smart-seq2 [11] and RamDA-seq [12], compared with “3′-tag sequencing” scRNA-seq methods, which sequence only the 3′ ends of RNAs and cannot be used to extract rich information from read coverage [13, 14].
- Despite their potential importance, however, tools specifically for the visualization of read coverage of scRNA-seq data are still lacking.
- To explore cell-to-cell heterogeneity in read coverage, we propose several requirements of a tool for visualization of read coverage in scRNA-seq data (Table 1).
- First, the tool must be able to display read coverage of all individual cells in a scRNA-seq dataset at once.
- This is because scRNA-seq data consist of many cells and frequently include latent heterogeneity that is masked by the summation of expression across cells.
- Second, the tool must associate read coverage with genomic contexts, such as gene structures and epigenomic features, because read coverage data can be interpreted only when they are displayed together with their genomic contexts.
- Third, the tool must highlight the cell-to-cell heterogeneity of read coverage within focal regions.
- This is because there should be “local” region-specific cell-to-cell heterogeneity in read coverage at transcriptional (e.g., antisense RNAs and eRNAs) and post-transcriptional (e.g., alternative splicing) levels, and such heterogeneity is difficult to notice in advance by cell groupings defined according to global similarity among cells.
- Genome browsers and heat maps are two major tools for read coverage visualization.
- In genome browsers, read coverage can easily be compared with other features and be interpreted in genomic contexts like gene models and epigenomic signals, which helps to generate and validate biological hypotheses.
- However, existing genome browsers are not suited for the large numbers of samples (i.e., cells) in scRNA-seq experiments.
- Indeed, efforts to visualize read coverage of scRNA-seq data using genome browsers have been limited to displays of a few dozen cells without the need to scroll [17, 18].
- Although IGV and JBrowse implement heat map representations of tracks to show many cells at once, they cannot dynamically reorder tracks to reveal local cell-to-cell heterogeneity in read coverage.
- Tools for heat maps combined with clustering algorithms have been used in the analysis of scRNA-seq data.
- Thus, heat maps can be used to visualize read coverage of all cells at once and reveal heterogeneity in read coverage.
- However, tools for generating heat maps are unsuited for visualizing read coverage of scRNA-seq data in genomic contexts, or they lack functionality to directly extract read coverage from standard NGS data formats.
- Here, we have developed Millefy, which combines genome-browser-like visualization, heat maps, and dynamic reordering of single-cell read coverage and thus facilitates the examination of local heterogeneity within scRNA-seq data.
- Millefy extracts and organizes various types of useful information from read coverage of scRNA-seq data.
- Millefy visualizes read coverage from each individual cell as a heat map in which rows represent cells and columns represent genomic bins within a focal region.
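As a rough illustration of the matrix layout described above, the following Python sketch averages per-base coverage into bins so that each row is a cell and each column is a genomic bin. It is not Millefy's own code (Millefy is an R package); the function names `bin_coverage` and `coverage_matrix` and the toy per-base values are hypothetical.

```python
def bin_coverage(per_base, n_bins):
    """Average per-base coverage into n_bins roughly equal-width bins."""
    n = len(per_base)
    if n_bins <= 0 or n_bins > n:
        raise ValueError("n_bins must be between 1 and the region length")
    binned = []
    for b in range(n_bins):
        # Integer arithmetic keeps bin edges contiguous and non-empty.
        start = b * n // n_bins
        end = (b + 1) * n // n_bins
        window = per_base[start:end]
        binned.append(sum(window) / len(window))
    return binned


def coverage_matrix(cells, n_bins):
    """Build a matrix with one row per cell and one column per genomic bin."""
    return [bin_coverage(per_base, n_bins) for per_base in cells]


# Toy per-base coverage for three cells over an 8-bp focal region.
cells = [
    [0, 0, 2, 2, 4, 4, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 6, 6],
]
matrix = coverage_matrix(cells, n_bins=4)
for row in matrix:
    print(row)
```

Averaging within bins is one possible summary; taking the maximum per bin would be an equally valid choice for a sketch like this.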
- The heat map is aligned with tracks for gene annotations, genomic features, and bulk NGS data, enabling comparisons of single-cell read coverage with genomic contexts.
- Millefy supports iterative adjustment of plots by the millefy_adjust() function, which reuses the read coverage matrices of the last plot, enabling faster adjustment than simply replotting.
- In scRNA-seq tracks and bulk NGS data tracks, read coverage is normalized by user-provided normalization factors to correct for.
- Using the above tracks, Millefy can simultaneously display read coverage of each cell and mean read coverage of cells in each user-defined cell group as well as align scRNA-seq data with genome annotation data and NGS data.
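The normalization and per-group averaging mentioned above can be sketched in Python as follows. This is a minimal illustration, not Millefy's implementation: it assumes that normalizing means dividing each cell's coverage by its factor, and the names `normalize` and `group_means`, the factors, and the group labels are all hypothetical.

```python
def normalize(matrix, factors):
    """Divide each cell's row by its normalization factor (assumed convention)."""
    if len(matrix) != len(factors):
        raise ValueError("one normalization factor is required per cell")
    return [[value / factor for value in row] for row, factor in zip(matrix, factors)]


def group_means(matrix, groups):
    """Column-wise mean coverage for each user-defined cell group."""
    sums, counts = {}, {}
    for row, group in zip(matrix, groups):
        if group not in sums:
            sums[group] = [0.0] * len(row)
            counts[group] = 0
        sums[group] = [s + v for s, v in zip(sums[group], row)]
        counts[group] += 1
    return {g: [s / counts[g] for s in sums[g]] for g in sums}


# Toy binned coverage for three cells and two bins.
matrix = [[2.0, 4.0], [1.0, 3.0], [8.0, 0.0]]
factors = [2.0, 1.0, 4.0]        # hypothetical per-cell normalization factors
groups = ["0h", "0h", "12h"]     # hypothetical cell group labels

normalized = normalize(matrix, factors)
means = group_means(normalized, groups)
print(normalized)
print(means)
```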
- Millefy was implemented in R and can import scRNA-seq data without the need for format conversion.
- For scRNA-seq data, Millefy accepts BAM and BigWig formats, which are standard file formats for NGS data analysis.
- (a) Millefy imports scRNA-seq data and visualizes read coverage of individual cells as a heat map.
- This automatic reordering highlights cell-to-cell heterogeneity in read coverage, which is hidden by mean read coverage data.
- Millefy associates genomic contexts, including bulk NGS data, genomic features, and gene annotations, thus facilitating the interpretation of single-cell read coverage.
- White boxes represent computation of a read coverage matrix.
- For performing diffusion maps on read coverage data, Millefy utilizes the destiny package [23].
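To make the reordering idea concrete, here is a simplified diffusion-map sketch in pure Python. It is not the destiny package's algorithm, which offers further options such as locally scaled kernel widths; it assumes a fixed-width Gaussian kernel, and the function name, `sigma`, and the toy data are hypothetical. The first non-trivial diffusion component is found by power iteration on the symmetrically normalized kernel, and cells are then sorted by that component.

```python
import math
import random


def first_diffusion_component(matrix, sigma=2.0, n_iter=500, seed=0):
    """Return the first non-trivial diffusion component, one value per cell."""
    n = len(matrix)
    if n < 2:
        raise ValueError("need at least two cells")
    # Gaussian kernel on squared Euclidean distances between coverage rows.
    kernel = [
        [
            math.exp(-sum((a - b) ** 2 for a, b in zip(matrix[i], matrix[j]))
                     / (2.0 * sigma ** 2))
            for j in range(n)
        ]
        for i in range(n)
    ]
    degree = [sum(row) for row in kernel]
    # S = D^-1/2 K D^-1/2 has the same eigenvalues as the Markov matrix D^-1 K.
    sym = [[kernel[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
           for i in range(n)]
    # The top eigenvector of S (eigenvalue 1) is proportional to sqrt(degree).
    total = math.sqrt(sum(degree))
    trivial = [math.sqrt(d) / total for d in degree]
    rng = random.Random(seed)
    vec = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(n_iter):
        # Project out the trivial eigenvector, then apply S and renormalize.
        dot = sum(a * b for a, b in zip(vec, trivial))
        vec = [a - dot * b for a, b in zip(vec, trivial)]
        vec = [sum(sym[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(a * a for a in vec))
        if norm < 1e-12:  # all cells identical: no structure to order by
            return [0.0] * n
        vec = [a / norm for a in vec]
    # Convert the eigenvector of S into an eigenvector of the Markov matrix.
    return [vec[i] / math.sqrt(degree[i]) for i in range(n)]


# Toy binned coverage: two interleaved coverage patterns across six cells.
cells = [
    [5, 5, 0, 0], [0, 0, 5, 5], [5, 4, 0, 0],
    [0, 1, 5, 5], [4, 5, 0, 0], [0, 0, 4, 5],
]
dc1 = first_diffusion_component(cells)
order = sorted(range(len(cells)), key=lambda i: dc1[i])
print(order)
```

Sorting by the first component places cells with similar coverage patterns next to each other; the sign of the component is arbitrary, so either pattern may come first.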
- However, in such cases, the merged (or averaged) read coverage cannot capture heterogeneity in read coverage.
- For example, a change in the merged read coverage cannot indicate whether the number of cells expressing a gene increased or the expression level of that gene increased across all cells.
- In contrast, Millefy visualizes read coverage of all individual cells in a scRNA-seq dataset as a heat map and thereby provides detailed information on cellular heterogeneity in read coverage.
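A small numeric example in Python shows why averaged coverage is ambiguous. The two toy populations below are hypothetical: in one, half the cells express a gene strongly; in the other, every cell expresses it moderately. Both give the same mean, while the per-cell values tell them apart.

```python
def mean(values):
    """Arithmetic mean of per-cell coverage values."""
    return sum(values) / len(values)


def fraction_expressing(values, threshold=0.0):
    """Fraction of cells whose coverage exceeds the threshold."""
    return sum(v > threshold for v in values) / len(values)


few_high = [8.0, 8.0, 0.0, 0.0]   # half of the cells express strongly
all_low = [4.0, 4.0, 4.0, 4.0]    # every cell expresses moderately

print(mean(few_high), mean(all_low))
print(fraction_expressing(few_high), fraction_expressing(all_low))
```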
- To demonstrate the usefulness of Millefy’s ability to visualize read coverage in scRNA-seq data, we used a time-course RamDA-seq dataset derived from mouse embryonic stem cells (mESCs) upon induction of cell differentiation to primitive endoderm cells (at multiple time points up to 72 h) [12].
- Figure 2 shows the read coverage at Sox17, a differentiation marker gene.
- Cells were reordered according to the first diffusion component values calculated by a diffusion map of read coverage data for the locus, either within user-defined cell groups (Fig.
- While the height of the mean read coverage increased along the differentiation time course, the reordered heat map highlights the heterogeneity of read coverage among cells from the same time points (e.g., the 12 h group) (Fig.
- Figure 3 shows read coverage of 421 individual cells at the Zmynd8 locus.
- The cells were dynamically reordered using diffusion maps based on the read coverage in the focal region.
- We note that the averaged read coverage for each time point cannot distinguish whether the long and short isoforms of Zmynd8 and Zmynd8as are correlated or uncorrelated.
- These results demonstrate that Millefy’s functionality for displaying read coverage as a reordered heat map reveals cell-to-cell heterogeneity at the focal locus.
- Millefy application on scRNA-seq data from triple-negative breast cancer patients.
- We also applied Millefy to a scRNA-seq dataset from triple-negative breast cancer (TNBC) patients.
- Specifically, for c-JUN, some cells showed short read coverage and others showed long read coverage (Fig.
- In the last exon of NRAS, many cells showed long 3′ UTR read coverage but some cells showed a shortened 3′ UTR read coverage (Fig.
- Such heterogeneity cannot be determined by the aggregated (averaged) read coverage alone (Fig.
- Millefy associates read coverage with genomic contexts to facilitate interpretation of read coverage.
- Genomic contexts are crucial for interpretation of read coverage in bulk and single-cell RNA sequencing data.
- For example, read coverage overlapped with gene annotations can confirm known and reveal novel exon-intron structures.
- Moreover, read coverage overlapped with enhancer annotations can be interpreted as eRNA expression [6].
- Using Millefy, single-cell read coverage can be compared.
- Fig. 2: Millefy visualization of read coverage at the Sox17 locus.
- The top heat map shows single-cell read coverage.
- The middle tracks show the averaged read coverage at different time points, the bulk RNA sequencing read coverage, and enhancer annotations.
- To demonstrate the usefulness of the simultaneous visualization of single-cell read coverage and genomic contexts, we compared read coverage of the RamDA-seq data from mESCs (0 h) with mESC enhancer regions.
- Figure 5 displays read coverage at the Myc locus, with the positions of enhancers active in mESCs.
- The Myc gene models and read coverage reveal that Myc was transcribed in mESCs.
- This result exemplifies how Millefy can help to interpret read coverage of scRNA-seq data in genomic contexts.
- Millefy facilitates quality control in full-length scRNA-seq methods.
- Millefy can also be used for QC in full-length scRNA-seq methods.
- For example, scRNA-seq read coverage of long transcripts indicates whether the method employed provided full-length transcript coverage.
- Full-length transcript coverage provides accurate information about isoform expression and gene structures and is a fundamental feature of full-length scRNA-seq methods [29].
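One way to put a number on "full-length transcript coverage" per cell is the fraction of transcript bins that reach a minimum coverage. The Python sketch below is only an illustrative metric, not something the source describes Millefy computing; `covered_fraction`, the `min_cov` threshold, and the toy cells are hypothetical.

```python
def covered_fraction(row, min_cov=1.0):
    """Fraction of transcript bins whose coverage is at least min_cov."""
    if not row:
        raise ValueError("row must contain at least one bin")
    return sum(v >= min_cov for v in row) / len(row)


# Toy binned coverage along one long transcript, ordered 5' to 3'.
cells = {
    "cell_A": [3, 2, 4, 1, 2, 3],   # covered along the whole transcript
    "cell_B": [0, 0, 0, 1, 5, 9],   # coverage concentrated at the 3' end
}
fractions = {name: covered_fraction(row) for name, row in cells.items()}
print(fractions)
```

A cell with a low fraction on a long transcript would be a candidate for closer visual inspection in the heat map.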
- Figure 6 shows the read coverage at Mdn1, a gene with a long transcript (17,970 bp) consisting of 102 exons.
- The lower reproducibility in read coverage of C1-SMART-Seq V4 relative to C1-RamDA-seq is likely owing to technical noise because the samples were prepared not from living cells but from a dilution of 10 pg of RNA.
- We note that mean read coverage cannot provide such detailed information on reproducibility in read coverage.
- Fig. 5: Millefy visualization of read coverage and enhancer regions around the Myc locus.
- Fig. 6: Example of quality control of scRNA-seq methods.
- Visualization of read coverage from C1-RamDA-seq (n = 95) and C1-SMART-Seq V4 (n = 96) data from a dilution of 10 pg of RNA at the Mdn1 locus by Millefy.
- This result exemplifies how Millefy can be used for quality control of scRNA-seq methods.
- In this paper, we proposed Millefy, a tool for visualizing cell-to-cell heterogeneity in read coverage in scRNA-seq data.
- Millefy combines genome-browser-like visualization, heat maps, and dynamic reordering of single-cell read coverage.
- Thereby, Millefy can display read coverage of all cells at once, associate read coverage with genomic contexts, and highlight the cell-to-cell heterogeneity of read coverage within focal regions (Table 1).
- Using scRNA-seq data of mESCs and TNBC, we demonstrate the effectiveness of Millefy to reveal local heterogeneity in read coverage within scRNA-seq data.
- Third, by associating read coverage with enhancer annotations, Millefy helped to interpret RNA transcription events in non-coding regions (Fig.
- Collectively, these results indicate that Millefy enables the exploration of cellular heterogeneity of various biological events from read coverage of scRNA-seq data, which could be missed by conventional visualization tools.
- Using scRNA-seq data of diluted RNA with different full-length scRNA-seq methods (Fig. 6), we demonstrate that Millefy visualizes read coverage in scRNA-seq as a QC measure and complements existing scRNA-seq QC pipelines based primarily on gene expression matrices [3].
- In the development of bioinformatics methods using rule-based and machine learning approaches for profiling alternative splicing or novel RNAs by scRNA-seq, read coverage visualization tools like Millefy will become more important for evaluating and representing the predictions of algorithms.
- Millefy, which is integrated with Jupyter Notebook and provided as a Docker image, can easily be utilized in exploratory analyses of scRNA-seq data.
- In conclusion, Millefy will provide new opportunities to analyze scRNA-seq data from the point of view of cell-to-cell heterogeneity in read coverage, and help researchers assess cellular heterogeneity and RNA biology using scRNA-seq data.
- scRNA-seq: single-cell RNA sequencing.
