Prediction of lncRNA–Protein Interactions via the Multiple Information Integration

Chen, Yifan; Fu, Xiquan; Li, Zejun; Zhuo, Linlin

doi:10.3389/fbioe.2021.647113

Cited by 5 publications

(3 citation statements)

References 69 publications

“…After extracting feature information for the full Plant R protein dataset, to eliminate noise and redundant features from the original feature space and reduce overfitting to improve performance, we employ the SVM-RFE + CBR (Yan and Zhang, 2015) algorithm to select the best feature subset. The SVM-RFE + CBR (Yan and Zhang, 2015) algorithm has been successfully applied to many systems biology problems (Fu et al, 2018, 2019a, b; Chen et al, 2021). We first use SVM-RFE + CBR to rank all feature vectors and select a set of top-ranked feature vectors, and then reorganize the selected feature vectors into new and ordered feature vectors.…”

Section: Datasets and Methods (mentioning)

confidence: 99%

Prediction of Plant Resistance Proteins Based on Pairwise Energy Content and Stacking Framework

Chen, Li, Li. 2022. Front. Plant Sci.

Self Cite

Plant resistance proteins (R proteins) recognize effector proteins secreted by pathogenic microorganisms and trigger an immune response against pathogenic microbial infestation. Accurate identification of plant R proteins is an important research topic in plant pathology. Plant R protein prediction has achieved many research results. Recently, some machine learning-based methods have emerged to identify plant R proteins. Still, most of them only rely on protein sequence features, which ignore inter-amino acid features, thus limiting the further improvement of plant R protein prediction performance. In this manuscript, we propose a method called StackRPred to predict plant R proteins. Specifically, the StackRPred first obtains plant R protein feature information from the pairwise energy content of residues; then, the obtained feature information is fed into the stacking framework for training to construct a prediction model for plant R proteins. The results of both the five-fold cross-validation and independent test validation show that our proposed method outperforms other state-of-the-art methods, indicating that StackRPred is an effective tool for predicting plant R proteins. It is expected to bring some favorable contribution to the study of plant R proteins.

“…Studies on lncRNA-miRNA interactions generally fall under two categories, namely, bioinformatics-based machine learning methods and similarity network-based methods (Liu et al, 2017, 2020; Peng et al, 2017; Zeng et al, 2017, 2018, 2019; Zhao et al, 2020; Chen et al, 2021; Singh et al, 2021; Wang et al, 2021; Zhou et al, 2021; Zhu et al, 2021). The former extracts biological features and trains models to obtain dichotomous results (i.e., the output is whether lncRNA and miRNA interact) (Intell, 2019; Li J. et al, 2021).…”

Section: Introduction (mentioning)

confidence: 99%

MILNP: Plant lncRNA–miRNA Interaction Prediction Based on Improved Linear Neighborhood Similarity and Label Propagation

et al. 2022

Self Cite

Knowledge of the interactions between long non-coding RNAs (lncRNAs) and microRNAs (miRNAs) is the basis for understanding various biological activities and designing new drugs. Computational methods for predicting lncRNA–miRNA interactions have been lacking for plants, and existing methods suffer from various limitations that affect their prediction accuracy and applicability. Research on plant lncRNA–miRNA interactions is still in its infancy. In this paper, we propose an accurate predictor, MILNP, for predicting plant lncRNA–miRNA interactions based on an improved linear neighborhood similarity measurement and a linear neighborhood propagation algorithm. Specifically, we propose a novel similarity measure based on linear neighborhood similarity from multiple similarity profiles of lncRNAs and miRNAs and derive more precise neighborhood ranges so as to overcome the limits of existing methods. We then simultaneously update the lncRNA–miRNA interactions predicted from both similarity matrices based on label propagation. We comprehensively evaluate MILNP on the latest plant lncRNA–miRNA interaction benchmark datasets. The results demonstrate that MILNP outperforms the most up-to-date methods. Moreover, MILNP can be applied to isolated plant lncRNAs (or miRNAs). Case studies suggest that MILNP can identify novel plant lncRNA–miRNA interactions, which are confirmed by classical tools. The implementation is available at https://github.com/HerSwain/gra/tree/MILNP.
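The label propagation step mentioned in the abstract can be illustrated with a small sketch. This is not the MILNP implementation: it assumes the commonly used update F ← α·W·F + (1 − α)·Y, where W is a row-normalized similarity matrix and Y holds the known interactions. The function names, the value of `alpha`, and the toy matrices are all hypothetical, and a plain similarity matrix stands in for the improved linear neighborhood similarity described above.

```python
def row_normalize(matrix):
    """Scale each row to sum to 1 so propagation averages over neighbors."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def propagate_labels(similarity, interactions, alpha=0.5, max_iter=200, tol=1e-9):
    """Iterate F <- alpha * W * F + (1 - alpha) * Y until the scores stabilize.

    similarity:   n x n similarity between lncRNAs (toy values, assumed)
    interactions: n x m known lncRNA-miRNA interactions (0 or 1)
    alpha:        share of each score taken from neighbors (assumed value)
    """
    weights = row_normalize(similarity)
    n_rows, n_cols = len(interactions), len(interactions[0])
    scores = [[float(v) for v in row] for row in interactions]
    for _ in range(max_iter):
        updated = []
        for i in range(n_rows):
            new_row = []
            for j in range(n_cols):
                spread = sum(weights[i][k] * scores[k][j] for k in range(n_rows))
                new_row.append(alpha * spread + (1 - alpha) * interactions[i][j])
            updated.append(new_row)
        delta = max(
            abs(updated[i][j] - scores[i][j])
            for i in range(n_rows)
            for j in range(n_cols)
        )
        scores = updated
        if delta < tol:
            break
    return scores


# Toy data: lncRNA 1 has no known interactions but is most similar to lncRNA 0.
similarity = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
interactions = [
    [1, 0],  # lncRNA 0 interacts with miRNA 0
    [0, 0],  # lncRNA 1 is "isolated" (no known partners)
    [0, 1],  # lncRNA 2 interacts with miRNA 1
]
scores = propagate_labels(similarity, interactions)
```

In this toy setting the isolated lncRNA 1 receives a higher score for miRNA 0 than for miRNA 1, because it inherits most of its score from its closest neighbor, lncRNA 0. That is the intuition behind applying such a method to lncRNAs with no known partners.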

“…Machine learning-based LPI inference methods characterized the biological features of lncRNAs and proteins and exploited machine learning algorithms to probe LPI candidates [ 22 ]. Machine learning-based LPI prediction methods contain matrix factorization techniques and ensemble learning techniques [ 23 ].…”

Section: Introduction (mentioning)

confidence: 99%

LPI-deepGBDT: a multiple-layer deep framework based on gradient boosting decision trees for lncRNA–protein interaction identification

et al. 2021

Background: Long noncoding RNAs (lncRNAs) play important roles in various biological and pathological processes. Discovery of lncRNA–protein interactions (LPIs) contributes to understanding the biological functions and mechanisms of lncRNAs. Although wet experiments have found a few interactions between lncRNAs and proteins, experimental techniques are costly and time-consuming. Therefore, computational methods are increasingly exploited to uncover the possible associations. However, existing computational methods have several limitations. First, the majority of them were evaluated on one simple dataset, which may result in prediction bias. Second, few of them can be applied to identify relevant data for new lncRNAs (or proteins). Finally, they fail to utilize diverse biological information of lncRNAs and proteins.

Results: Using a feed-forward deep architecture based on gradient boosting decision trees (LPI-deepGBDT), this work focuses on classifying unobserved LPIs. First, three human LPI datasets and two plant LPI datasets are arranged. Second, the biological features of lncRNAs and proteins are extracted by Pyfeat and BioProt, respectively. Third, the features are dimensionally reduced and concatenated as a vector to represent an lncRNA–protein pair. Finally, a deep architecture composed of forward mappings and inverse mappings is developed to predict underlying linkages between lncRNAs and proteins. LPI-deepGBDT is compared with five classical LPI prediction models (LPI-BLS, LPI-CatBoost, PLIPCOM, LPI-SKF, and LPI-HNM) under three cross validations on lncRNAs, proteins, and lncRNA–protein pairs, respectively. It obtains the best average AUC and AUPR values under the majority of situations, significantly outperforming the other five LPI identification methods. That is, the AUCs computed by LPI-deepGBDT are 0.8321, 0.6815, and 0.9073, respectively, and the AUPRs are 0.8095, 0.6771, and 0.8849, respectively. The results demonstrate the powerful classification ability of LPI-deepGBDT. Case study analyses show that there may be interactions between GAS5 and Q15717, RAB30-AS1 and O00425, and LINC-01572 and P35637.

Conclusions: By integrating ensemble learning and hierarchical distributed representations and building a multiple-layered deep architecture, this work improves LPI prediction performance and effectively probes interaction data for new lncRNAs/proteins.
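The abstract reports AUC and AUPR values without stating how they were computed. As a minimal sketch, assuming the usual definitions, AUC can be obtained as the fraction of positive/negative pairs that are ranked correctly, and AUPR can be estimated by average precision. The function names and toy labels below are illustrative only and are not taken from LPI-deepGBDT.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def average_precision(labels, scores):
    """Average precision, a common step-wise estimate of the area under the PR curve.

    Precision is sampled at the rank of every positive example. With tied
    scores the result depends on the order in which the ties are sorted.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return precision_sum / hits


# Toy predictions for five candidate lncRNA-protein pairs (1 = known interaction).
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
aupr = average_precision(labels, scores)
```

Other estimators of the area under the precision-recall curve exist (for example trapezoidal interpolation), so values computed this way may differ slightly from those produced by a given toolkit.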
