- Single Cell Explorer: collaboration-driven tools to leverage large-scale single cell RNA-seq data.
- Background: Single cell transcriptome sequencing has become an increasingly valuable technology for dissecting complex biology at a resolution impossible with bulk sequencing.
- Results: Single Cell Explorer is a Python-based web server application we developed to enable computational and experimental scientists to iteratively and collaboratively annotate cell expression phenotypes within a user-friendly and visually appealing platform.
- These annotations can be modified and shared by multiple users to allow easy collaboration between computational scientists and experimental biologists.
- Data processing and analytic workflows can be integrated into the system using Jupyter notebooks.
- identification of differential gene expression patterns for user-defined cell populations and convenient annotation of cell types using marker genes or differential gene expression patterns.
- As such, by making single cell RNA-seq data sharing and querying more user-friendly, the software promotes deeper understanding and innovation by research teams applying single cell transcriptomic approaches.
- Conclusions: Single Cell Explorer is a freely available single cell transcriptomic analysis tool that enables computational and experimental biologists to collaboratively explore, annotate, and share results in a flexible software environment and a centralized database server that supports data portal functionality.
- Rapidly evolving single cell sequencing technologies are enabling researchers to generate data that have the potential to lead to unprecedented biological insight, albeit at the cost of greater complexity of analysis.
- Open-source, point-and-click, web-based interfaces have become a popular choice to share the analytic results of single cell experiments [1].
- increase in the creation of experiment types, pipelines and methods, it may be considered impossible to generate a single graphical user interface (GUI) that covers a large number of methods without impairing usability.
- We developed Single Cell Explorer using hybrid approaches, including the application of a Python-based programming environment and a web app GUI, to enable result sharing and fluid data exploration.
- 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0).
- Full list of author information is available at the end of the article.
- notebook is becoming increasingly popular in the bioinformatics research community [5].
- Single Cell Explorer's GUI was designed to be intuitive and easy to use, so that experimental biologists can explore data with minimal training.
- Single Cell Explorer was developed as a generalized platform for research teams to share and use single cell transcriptome data generated from either pipelines or processed data, with full open access to complex workflows, tools, and methodologies, all behind a simple interface.
- In contrast to the existing R-based frameworks, Single Cell Explorer will scale to large collections of studies by integrating with modern, performant databases and workflows such as Scanpy [6].
- Single Cell Explorer was written in the Python 3.0 programming language and built with the Django framework.
- It can be launched on any server that supports the Python environment.
- A component view of the system is shown in Fig. 1a, which reveals the integration of analytic pipelines via the Single Cell Explorer database.
- This step includes read alignment, gene quantitation, and quality control, employing Cell Ranger v3.0 (http://10xgenomics.com) to process Chromium single-cell RNA-seq FASTQ data.
- Alternatively, raw data can be processed using Bash or Nextflow.
- The principal output of this step includes the filtered cell/gene expression matrix as well as the matrix describing the 2D coordinates of the cells in lower-dimensional space.
- After the data has been loaded, the web front end enables users to visualize and query downstream analytic results through interaction with the lower-dimensional map of the cells.
- In addition to basic data exploration, cell type annotations can be captured by users and stored.
- Single cell RNA-seq data processing and analysis.
- As an example of the utility of Single Cell Explorer, a test run was performed on a publicly available dataset of human peripheral blood mononuclear cells (PBMCs) (https://support.10xgenomics.com/single-cell-gene-expression/datasets).
- The Cell Ranger pipeline can be started using runCellrangerProcess, a function in the notebook, and is followed by the Scanpy analytic workflow in the same Jupyter notebook for quality control and dimensionality reduction.
- The project metadata, cell/gene expression matrix, normalized data, and results of the 2D cell mapping are then uploaded by the notebook to the MongoDB instance.
- For high-dimensional single cell data, lower-dimensional representations such as t-SNE or UMAP are necessary to interact with the data and to easily observe broad relationships between cells (Fig.
- Single Cell Explorer supports all types of low-dimensional representation [7].
- Here we show a re-analysis of single cell RNA-seq data for cells from the early human maternal-fetal interface [8].
- Multiple types of metadata, including cell types, cluster information, and sample information such as tissue, donor, and any other clinical features, can be overlaid on the feature plot.
- The user interface provides a simple gene expression search function for each feature plot.
- 1) will be shown for querying single gene expression.
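As a rough illustration of how the uploaded 2D coordinates, metadata, and expression values might be combined to answer a single-gene query, the following Python sketch uses plain dictionaries as a stand-in for database collections. The field names (`barcode`, `x`, `y`, `cluster`), the function names, the barcodes, and all values are hypothetical; they are not taken from the Single Cell Explorer schema or API.

```python
from typing import Dict, List, Tuple


def build_cell_documents(
    barcodes: List[str],
    coords: List[Tuple[float, float]],
    metadata: Dict[str, Dict[str, str]],
) -> List[dict]:
    """One document per cell: barcode, 2D map position, and overlay metadata.

    The layout is an assumption made for illustration only.
    """
    return [
        {"barcode": bc, "x": xy[0], "y": xy[1], **metadata.get(bc, {})}
        for bc, xy in zip(barcodes, coords)
    ]


def build_gene_documents(
    barcodes: List[str], genes: List[str], matrix: List[List[float]]
) -> Dict[str, Dict[str, float]]:
    """One entry per gene, keeping only non-zero values (a sparse layout).

    `matrix` has one row per cell and one column per gene.
    """
    docs: Dict[str, Dict[str, float]] = {}
    for gi, gene in enumerate(genes):
        docs[gene] = {
            bc: row[gi] for bc, row in zip(barcodes, matrix) if row[gi] != 0
        }
    return docs


def overlay_gene(
    cell_docs: List[dict], gene_docs: Dict[str, Dict[str, float]], gene: str
) -> List[Tuple[float, float, float]]:
    """Return (x, y, expression) per cell; cells with no stored value get 0.0."""
    values = gene_docs.get(gene, {})
    return [(d["x"], d["y"], values.get(d["barcode"], 0.0)) for d in cell_docs]


# Toy data: three cells, two genes (all values are made up).
barcodes = ["AAAC", "AAAG", "AACT"]
coords = [(1.0, 2.0), (-0.5, 0.3), (4.2, -1.1)]
metadata = {"AAAC": {"cluster": "0"}, "AAAG": {"cluster": "1"}}
genes = ["CD3E", "MS4A1"]
matrix = [[2.5, 0.0], [0.0, 1.2], [0.0, 0.0]]

cell_docs = build_cell_documents(barcodes, coords, metadata)
gene_docs = build_gene_documents(barcodes, genes, matrix)
points = overlay_gene(cell_docs, gene_docs, "CD3E")
```

Storing only non-zero values per gene is one plausible way to keep single-gene lookups cheap for sparse expression matrices; the actual storage layout used by the application may differ.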
- The interface also supports queries for two genes simultaneously, with the expression pattern of each gene shown in a different color.
- Cell type identification and annotation.
- A key challenge in analyzing single cell RNA-seq data is choosing an optimal cell clustering parameter that best delineates the key cell subtypes.
- Note that the default resolution value of 1 produces more clusters than the major immune cell types we typically aim to identify in our routine expression analyses.
- Fig. 1 Single Cell Explorer workflow architecture process and component view.
- a Overview of the data processing workflow steps for Single Cell Explorer.
- Step #3: Interactive data analyses and annotation of cell types.
- Step #5: All results from MongoDB can be accessed directly or via API.
- b A screenshot of the Single Cell Explorer data navigator page and a t-SNE map for one dataset.
- Individual pre-labeled cell types are shown in different colors.
- A 2D plot of circles indicates the proportion of the single positive and double positive cells.
- b To query a list of genes, a heatmap can be generated after freehand selection of cells of interest.
- ILC3 cells can be identified using markers including KIT and DLL1.
- Fig. 3 Understanding single cell clustering results.
- c A heatmap of marker gene expression within each cluster defined by the Leiden algorithm.
- Fig. 4 Cell type and feature discovery.
- Step #4: Interactively visualize gene expression levels using the resulting table.
- Step #5: Record cell types and marker genes for future reference.
- Step #6: Position the newly labelled cells on the map and compare with other specific cell types.
- Single Cell Explorer enables users to examine expression of key cell type markers in UMAP (e.g.
- Users can match the marker gene expression with the cell cluster identification number using the heatmap function (Fig.
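The idea of matching marker genes to cluster identification numbers through a heatmap can be sketched as a table of mean marker expression per cluster. The Python example below is a minimal sketch under stated assumptions, not the application's implementation: the function names, the input layout, the barcodes, and the expression values are all hypothetical, and the two gene symbols serve only as illustrative markers.

```python
from collections import defaultdict
from typing import Dict, List


def mean_expression_by_cluster(
    clusters: Dict[str, str],
    expression: Dict[str, Dict[str, float]],
    markers: List[str],
) -> Dict[str, Dict[str, float]]:
    """Average each marker gene over the cells of each cluster.

    clusters:   barcode -> cluster id
    expression: barcode -> {gene: normalized value}; missing genes count as 0
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for barcode, cluster in clusters.items():
        counts[cluster] += 1
        cell = expression.get(barcode, {})
        for gene in markers:
            sums[cluster][gene] += cell.get(gene, 0.0)
    return {
        c: {g: sums[c][g] / counts[c] for g in markers} for c in counts
    }


def best_cluster_per_marker(
    table: Dict[str, Dict[str, float]], markers: List[str]
) -> Dict[str, str]:
    """For each marker, report the cluster with the highest mean expression."""
    return {g: max(table, key=lambda c: table[c][g]) for g in markers}


# Toy data: four cells in two clusters (all values are made up).
clusters = {"c1": "0", "c2": "0", "c3": "1", "c4": "1"}
expression = {
    "c1": {"CD3E": 2.0},
    "c2": {"CD3E": 4.0},
    "c3": {"MS4A1": 3.0},
    "c4": {"MS4A1": 1.0, "CD3E": 1.0},
}
markers = ["CD3E", "MS4A1"]

heatmap_table = mean_expression_by_cluster(clusters, expression, markers)
assignment = best_cluster_per_marker(heatmap_table, markers)
```

Each row of `heatmap_table` corresponds to one heatmap row; picking the highest-scoring cluster per marker is a crude first pass, and in practice the heatmap would be inspected alongside the 2D map before a cell type name is recorded.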
- When cell type composition is unclear or cell marker information remains unknown, differentially expressed genes can be obtained first.
- Then these newly identified cell markers or domain knowledge can be used to annotate cell clusters in the 2D map (Fig.
- The user can name the cell type by choosing a cell type name from a list (to enforce a controlled vocabulary), or add new names that do not exist in the database.
- Other statistical methods can be applied in Jupyter Notebook or RStudio.
- This capability not only allows users to delineate and explore potential new cell subset types, but also enables single cell data sets to be viewed from different dimensions beyond pre-set or pre-conceived cell marker paradigms, potentially fostering innovative viewpoints and new hypotheses.
- The annotated data will be displayed in the web application.
- The following Python API functions (Table 1) were designed to retrieve data from the Single Cell Explorer database.
- clusterName is the annotated cell type.
- The API can be used to compute differentially expressed genes, or for other bioinformatics analyses.
- To the best of our knowledge, no other single cell sequencing software currently provides reanalysis capabilities that include drawing, annotation, saving the results in a database, and integration with Jupyter notebook for more complex analyses.
- Cellxgene [10] is a Python-based interactive data visualization tool for single-cell transcriptomic datasets, but it focuses on well-curated single data sets without comprehensive database support.
- We developed Single Cell Explorer, a Python-based platform which promotes a collaborative data sharing experience for single cell transcriptomic data.
- getAllClstrsByClstrsType: retrieves a table of cell barcodes and annotated cell types in a specific map.
- getNormalizedGeneExpr: gets the normalized counts matrix for genes of interest from specific cell types in a specific map.
- getAllNormalizedGeneExpr: gets the full normalized gene counts matrix from specific cell types in a specific map.
- exportAllClstrsByClstrsType: exports the cell barcodes and annotated cell types of the cells from a specific map into a CSV file.
- CEW participated in the design, testing, and feedback on the requirement specification.
- JH produced the user manual video and contributed to the manuscript revision.
- Availability of data and materials. Project name: Single Cell Explorer.
- The human cell atlas bone marrow single-cell interactive web portal.
- iS-CellR: a user-friendly tool for analyzing and visualizing single-cell RNA sequencing data.
- ASAP: a web-based platform for the analysis and interactive visualization of single-cell RNA-seq data.
- Integrating single-cell transcriptomic data across different conditions, technologies, and species.
- BioJupies: automated generation of interactive notebooks for RNA-Seq data analysis in the cloud.
- SCANPY: large-scale single-cell gene expression data analysis.
- VASC: dimension reduction and visualization of single-cell RNA-seq data by deep variational autoencoder.
- Single-cell reconstruction of the early maternal-fetal interface in humans.
- Bias, robustness and scalability in single-cell differential expression analysis.