
Identify RNA-associated subcellular localizations based on multi-label learning using Chou’s 5-steps rule


Summary

- However, many existing RNA subcellular localization classifiers solve only the single-label classification problem.
- It is of great practical significance to extend RNA subcellular localization to a multi-label classification problem.
- Results: In this study, we extract multi-label classification datasets about RNA-associated subcellular localizations on various types of RNAs, and then construct subcellular localization datasets on four RNA categories.
- In order to study Homo sapiens, we further establish human RNA subcellular localization datasets.
- The optimal combined kernel can be put into an integration support vector machine model for identifying multi-label RNA subcellular localizations.
- Keywords: RNA subcellular localization, Multi-label classification, Hilbert-Schmidt independence criterion, Multiple kernel learning, Web server.
- protein subcellular localization [1–6].
- [33] built a database called RNALocate, which collected more than 42,000 manually engineered RNA subcellular localization entries.
- [34] constructed a database named LncATLAS to store the subcellular localization of lncRNA.
- [40] developed lncLocator to predict the subcellular localization of long non-coding RNAs.
- [41] proposed a novel method that uses a sequence-to-sequence model to predict microRNA subcellular localization.
- [42] developed MiRGOFS, a GO-based functional similarity measurement applied to miRNA subcellular localization.
- have been used to predict subcellular localization with good results.
- However, most existing RNA subcellular localization classifiers solve only the single-label classification problem.
- Therefore, it is of great practical significance to extend RNA subcellular localization to a multi-label classification problem.
- In view of the above research, there is no multi-label RNA subcellular localization dataset available for this task.
- The optimal combined kernel can be put into an integration support vector machine model for training a multi-label RNA subcellular localization classifier.
- (3) a major challenge is to fuse the multivariate information through multiple kernel learning based on the Hilbert-Schmidt independence criterion; the optimal combined kernel can then be put into an integration support vector machine model for training a multi-label RNA subcellular localization classifier.
- Here, we compare single-kernel feature models on four RNA subcellular localization datasets, as shown in Table 1.
- It can be observed that kmer achieves the best performance on mRNAs (AP: 0.688) and lncRNAs (AP: 0.745), NAC obtains the best performance on miRNAs (AP: 0.785), and DNC achieves the best performance on snoRNAs (AP: 0.793).
- Details are shown in Additional file 1: Table S5.
- We also compare single-kernel feature models on four human RNA subcellular localization datasets, as shown in Table 2.
- It can be seen that kmer achieves the best performance on mRNAs (AP: 0.750), lncRNAs (AP: 0.753), and snoRNAs (AP: 0.817), while CKSNAP obtains the best performance on miRNAs (AP: 0.784).
- Details are shown in Additional file 1:.
- This phenomenon is also reflected in the four human RNA datasets, as shown in Fig.
- Table 1 Average Precision of seven different nucleotide representations on four RNA datasets.
- Table 2 Average Precision of seven different nucleotide representations on four human RNA datasets.
- Here, we compare five integrated SVM strategies on four RNA subcellular localization datasets, as shown in Table 3.
- It can be observed that MKSVM-HSIC achieves the best performance on mRNAs (AP: 0.703), lncRNAs (AP: 0.757), miRNAs (AP: 0.787), and snoRNAs (AP: 0.800).
- Details are shown in Additional file 1: Table S7.
- Also, we compare five integrated SVM strategies on four human RNA subcellular localization datasets, as shown in Table 4.
- It can be observed that MK-HSIC achieves the best performance on mRNAs (AP: 0.755), lncRNAs (AP: 0.754), miRNAs (AP: 0.791), and snoRNAs (AP: 0.816).
- Details are shown in Additional file 1: Table S8.
- It can be.
- Details are shown in Additional file 1: Table S9.
- Fig. 1 Feature importance scores of seven characteristics on four RNA datasets.
- Fig. 2 Feature importance scores of seven characteristics on four human RNA datasets.
- Here, we compare six classification methods on four RNA subcellular localization datasets, as shown in Table 5.
- It can be observed that MKSVM-HSIC achieves the best performance on mRNAs (AP: 0.703), lncRNAs (AP: 0.757), and miRNAs (AP: 0.787), while XGBT obtains the best performance on snoRNAs (AP: 0.806).
- Details are shown in Additional file 1: Table S10.
- We also compare six classification methods on four human RNA subcellular localization datasets, as shown in Table 6.
- It can be seen that MKSVM-HSIC achieves the best performance on mRNAs (AP: 0.755), lncRNAs (AP: 0.754), miRNAs (AP: 0.791), and snoRNAs (AP: 0.816).
- Details are shown in Additional file 1: Table S11.
- Details are shown in Additional file 1: Table S12.
- Table 3 Average Precision of five different integration strategies on four RNA datasets.
- Table 4 Average Precision of five different integration strategies on four human RNA datasets.
- It returns the likelihood of each label for RNA subcellular localization and also gives the suggested labels as the final prediction result.
- In this paper, we establish multi-label benchmark datasets for various RNA subcellular localizations to verify prediction tools.
- Furthermore, we design an integration SVM prediction model with a one-vs-rest strategy that fuses a variety of nucleic acid sequence representations to identify RNA subcellular localization.
- In this study, we establish RNA subcellular localization datasets, and then propose an integration learning model for multi-label classification.
- In order to study subcellular localization for Homo sapiens, we further establish human RNA subcellular localization datasets.
- We use the database of RNA subcellular localization to integrate, analyze and identify RNA subcellular localization, in order to speed up RNA structural and functional research.
- Table 5 Average Precision of five different classifiers on four RNA datasets.
- Thus, RNALocate provides a comprehensive source of subcellular localization and even insight into the function of hypothetical or new RNAs.
- We extract multi-label classification datasets about RNA-associated subcellular localizations on four RNA categories (mRNAs, lncRNAs, miRNAs and snoRNAs).
- The flowchart of mRNA subcellular localization dataset construction framework is shown in Fig.
- RNA subcellular localization datasets.
- We extract four RNA subcellular localization datasets, covering mRNAs, lncRNAs, miRNAs and snoRNAs.
- We delete samples with duplicate Gene IDs, remove samples without corresponding subcellular localization labels, and then construct four RNA subcellular localization datasets.
- We count the number of samples for each category of subcellular localization labels, and then select some.
- The statistical distributions of these four RNA datasets are shown in Fig.
- Human RNA subcellular localization datasets.
- We also extract four Homo sapiens RNA subcellular localization datasets, including H_mRNAs, H_lncRNAs, H_miRNA and H_snoRNAs.
- We select the Homo sapiens samples from the above four RNA datasets and construct four human RNA subcellular localization datasets.
- The statistical distributions of these four human RNA datasets are shown in Fig.
- An RNA sequence can be represented as follows: S = (s_1.
- Table 6 Average Precision of five different classifiers on four human RNA datasets.
- Fig. 4 The robustness of our novel method on four RNA datasets.
- Fig. 5 The robustness of our novel method on four human RNA datasets.
- Fig. 6 Schematic diagram of RNA subcellular localizations in cells.
- Fig. 7 The flowchart of mRNA subcellular localization dataset construction framework.
- Fig. 8 The statistical distributions of four RNA subcellular localization datasets.
- k = 2) descriptor can be calculated as follows.
- Fig. 9 The statistical distributions of four human RNA subcellular localization datasets.
- Nucleic acid composition.
- The frequency of each natural nucleic acid (‘A’, ‘C’, ‘G’, ‘T’ or ‘U’) can be calculated as follows.
- The frequency of each 2-tuple of natural nucleic acids can be calculated as follows.
- The frequency of each 3-tuple of natural nucleic acids can be calculated as follows.
- The optimal combinatorial kernel can be calculated as follows.
- The convex quadratic programming problem can be solved as follows.
- NAC: nucleic acid composition.
- Hum-PLoc: A novel ensemble classifier for predicting human protein subcellular localization.
- Methodology development for predicting subcellular localization and other attributes of proteins.
- A top-down approach to enhance the power of predicting human protein subcellular localization: Hum-mPLoc 2.0.
- Gram-positive and Gram-negative protein subcellular localization by incorporating evolutionary-based descriptors into Chou’s general PseAAC.
- pLoc-mAnimal: Predict subcellular localization of animal proteins with both single and multiple sites.
- pLoc_bal-mGpos: Predict subcellular localization of Gram-positive bacterial proteins by quasi-balancing training dataset and PseAAC.
- RNALocate: a resource for RNA subcellular localizations.
- LncATLAS database for subcellular localization of long noncoding RNAs.
- The lncLocator: a subcellular localization predictor for long non-coding RNAs based on a stacked ensemble classifier.
- Prediction of microRNA subcellular localization by using a sequence-to-sequence model.
- MiRGOFS: a GO-based functional similarity measurement for miRNAs, with applications to the prediction of miRNA subcellular localization and miRNA–disease association.
- pLoc_Deep-mHum: Predict subcellular localization of human proteins by deep learning.
- pLoc_Deep-mPlant: Predict subcellular localization of plant proteins by deep learning.
- pLoc_Deep-mVirus: A CNN model for predicting subcellular localization of virus proteins by deep learning.
- Incorporating organelle correlations into semi-supervised learning for protein subcellular localization prediction.
- Human protein subcellular localization identification via fuzzy model on kernelized neighborhood.
- Identification of protein subcellular localization via integrating evolutionary and physicochemical information into Chou’s general PseAAC.
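
The summary describes sequence descriptors built from the frequency of each single nucleic acid, each 2-tuple and each 3-tuple, but the formulas themselves did not survive extraction. The following is a minimal sketch of one common way to compute such frequencies, assuming normalization by the number of valid sliding windows; the function name `kmer_frequencies`, the handling of 'T' as 'U', and the toy sequence are illustrative assumptions, not the authors' code.

```python
from itertools import product

def kmer_frequencies(seq, k):
    """Return the frequency of every k-tuple over A, C, G, U in `seq`.

    Assumption: frequency = count / number of valid length-k windows.
    """
    seq = seq.upper().replace("T", "U")  # treat DNA-style 'T' as 'U'
    kmers = ["".join(p) for p in product("ACGU", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous symbols such as 'N'
            counts[window] += 1
            total += 1
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in kmers}

# Hypothetical toy sequence, for illustration only.
toy = "AUGGCUAA"
single = kmer_frequencies(toy, 1)   # 4 values: single nucleic acid frequencies
pairs = kmer_frequencies(toy, 2)    # 16 values: 2-tuple frequencies
triples = kmer_frequencies(toy, 3)  # 64 values: 3-tuple frequencies
```

Each call yields a fixed-length vector (4, 16 or 64 entries), so sequences of different lengths map to comparable feature vectors.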

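The summary states that kernels are fused through multiple kernel learning based on the Hilbert-Schmidt independence criterion (HSIC), with the optimal combination obtained from a convex quadratic program that is not reproduced here. The sketch below is therefore not the authors' optimization: it only illustrates the biased empirical HSIC estimate, trace(K H L H) / (n - 1)^2, between a feature kernel K and a label kernel L, and then uses a simple heuristic (weights proportional to HSIC) as a stand-in. All function names and the toy data are assumptions.

```python
def linear_kernel(X):
    """Gram matrix of inner products between the rows of X."""
    return [[sum(a * b for a, b in zip(x, z)) for z in X] for x in X]

def center(K):
    """Double-center a square kernel matrix: H K H with H = I - 11^T / n."""
    n = len(K)
    row = [sum(r) / n for r in K]
    col = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    tot = sum(row) / n
    return [[K[i][j] - row[i] - col[j] + tot for j in range(n)] for i in range(n)]

def hsic(K, L):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(K)
    Kc = center(K)
    trace = sum(Kc[i][j] * L[j][i] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2

def hsic_weights(kernels, L):
    """Heuristic stand-in for the QP: weights proportional to HSIC(K_m, L)."""
    scores = [max(hsic(K, L), 0.0) for K in kernels]
    total = sum(scores)
    if total == 0:
        return [1.0 / len(kernels)] * len(kernels)  # fall back to uniform weights
    return [s / total for s in scores]

def combine(kernels, weights):
    """Weighted sum of kernel matrices."""
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, kernels))
             for j in range(n)] for i in range(n)]

# Hypothetical toy data: X_a follows the labels, X_b does not.
Y = [[1, 0], [1, 0], [0, 1], [0, 1]]
X_a = [[0.0], [0.0], [1.0], [1.0]]
X_b = [[0.0], [1.0], [0.0], [1.0]]
L = linear_kernel(Y)
kernels = [linear_kernel(X_a), linear_kernel(X_b)]
weights = hsic_weights(kernels, L)
K_combined = combine(kernels, weights)
```

In this toy case the kernel built from `X_a` receives all of the weight, because the kernel built from `X_b` is statistically independent of the labels under the estimate. The combined matrix could then be passed to any SVM implementation that accepts a precomputed kernel.
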
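
All results in the summary are reported as Average Precision (AP), whose definition is not included in the extracted text. A common choice for multi-label evaluation is the example-based average precision, which measures how highly the true labels of each sample are ranked. The sketch below assumes that definition; the function name and the toy scores are illustrative.

```python
def average_precision(scores, labels):
    """Example-based average precision for multi-label ranking.

    scores: per-sample list of label scores (higher means more likely).
    labels: per-sample list of 0/1 ground-truth indicators.
    """
    total, used = 0.0, 0
    for s, y in zip(scores, labels):
        relevant = [j for j, v in enumerate(y) if v]
        if not relevant:
            continue  # samples without any true label are skipped
        # rank of label j = number of labels scored at least as high as j
        rank = {j: sum(1 for t in s if t >= s[j]) for j in range(len(s))}
        ap = sum(
            sum(1 for k in relevant if rank[k] <= rank[j]) / rank[j]
            for j in relevant
        )
        total += ap / len(relevant)
        used += 1
    return total / used if used else 0.0

# Hypothetical scores for two samples over three localization labels.
perfect = average_precision([[0.9, 0.2, 0.6]], [[1, 0, 1]])
mixed = average_precision([[0.9, 0.2, 0.6], [0.1, 0.8, 0.5]],
                          [[1, 0, 1], [1, 0, 0]])
```

A value of 1 means every true label is ranked above every false label for every sample; lower values indicate that some true labels are ranked below false ones.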