2019
DOI: 10.3390/data4010045

T4SS Effector Protein Prediction with Deep Learning

Abstract: Extensive research has been carried out on bacterial secretion systems, as they can pass effector proteins directly into the cytoplasm of host cells. The correct prediction of type IV protein effectors secreted by T4SS is important, since they are known to play a noteworthy role in various human pathogens. Studies on predicting T4SS effectors involve traditional machine learning algorithms. In this work we included a deep learning architecture, i.e., a Convolutional Neural Network (CNN), to predict IVA and IVB…

Cited by 12 publications (7 citation statements)
References 26 publications

“…There have been four common variations of DNNs, including the CNNs, the RNNs, the CNN-RNNs, and the DNNs. The CNNs have outstanding spatial information analysis capabilities and have been successfully applied in the prediction of secreted effectors (Xue et al, 2018, 2019; Açıcı et al, 2019), protein solubility (Khurana et al, 2018), and crystallization (Elbasir et al, 2019). Compared to CNNs, RNNs can handle sequential inputs effectively and recognize sequence motifs of varying length extraordinarily well, making them the preferred choice for machine translation, text generation, and image captioning (Esteva et al, 2019).…”
Section: Methods (mentioning)
confidence: 99%
“…Over the past decade, dozens of machine learning-based computational approaches have been proposed to identify different types of secreted effectors (Zeng and Zou, 2019), including support vector machine (SVM) (Samudrala et al, 2009; Yang et al, 2010; Wang et al, 2011, 2014, 2017; Dong et al, 2013; Zou et al, 2013; Goldberg et al, 2016; Esna Ashari et al, 2019a, b), random forest (RF) (Yang et al, 2013), artificial neural network (ANN) (Löwer and Schneider, 2009), naive Bayes (NB) (Arnold et al, 2009), hidden Markov model (HMM) (Xu et al, 2010; Lifshitz et al, 2013; Wang et al, 2013), logistic regression (LR) (Esna Ashari et al, 2018), decision tree (DT) (Wang et al, 2019a), gradient boosting (Chen et al, 2020), deep learning (DL) (Xue et al, 2018, 2019; Açıcı et al, 2019; Fu and Yang, 2019; Hong et al, 2020; Li et al, 2020a), and their ensemble methods (Burstein et al, 2009; Hobbs et al, 2016; Wang et al, 2018, 2019b; Xiong et al, 2018; Li et al, 2020b). Some of these methods have achieved relatively high predictive accuracy, while they can recognize only one type of secreted effector, such as SIEVE (Samudrala et al, 2009), EffectiveT3 (Arnold et al, 2009), T3_MM (Wang et al, 2013), GenSET (Hobbs et al, 2016), Bastion3 (Wang et al, 2019a), DeepT3 (Xue et al, 2019), WEDeepT3 (Fu and Yang, 2019), ACNNT3 (Li et al, 2020a), and EP3 (Li et al, 2020b) for T3SEs; T4EffPred (Zou et al, 2013), T4SEpre (Wang et al, 2014), DeepT4 (Xue et al, 2018), PredT4SE-Stack (Xiong et al, 2018), Bastion4 (Wang et al, 2019b), T4SE-XGB (Chen et al, 2020), and CNN-T4SE (Hong et al, 2020) for T4SEs; and Bastion6 (Wang et al, 2018) for T6SEs.…”
Section: Introduction (mentioning)
confidence: 99%
“…Convolutional neural network (CNN) architecture is a deep learning approach commonly used in theoretical and practical studies on topics such as disease classification, MRI reconstruction, and effector protein prediction. [11-13] A common CNN architecture consists of input, convolutional, pooling, and fully connected layers. Additionally, batch normalization, rectified linear unit, and dropout layers can also be used to speed up the process and reduce overfitting.…”
Section: Methods (mentioning)
confidence: 99%
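
The layer sequence named in the statement above (input, convolution, rectified linear unit, pooling, fully connected) can be illustrated with a toy forward pass. This is a minimal sketch only: it is one-dimensional, uses hand-set weights, and omits batch normalization and dropout. The function names and all numeric values are hypothetical and are not taken from the cited paper or from any library.

```python
# Toy 1D forward pass: conv -> ReLU -> max pool -> fully connected.
# All names, weights, and inputs here are illustrative assumptions.

def conv1d(x, kernel, bias=0.0):
    """Valid (no padding) cross-correlation of a 1D signal with one kernel."""
    k = len(kernel)
    return [
        sum(x[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(x) - k + 1)
    ]

def relu(x):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in x]

def max_pool1d(x, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]

def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [
        sum(w * v for w, v in zip(row, x)) + b
        for row, b in zip(weights, bias)
    ]

def forward(x, kernel, weights, bias):
    """Chain the layers in the order listed in the citation statement."""
    features = max_pool1d(relu(conv1d(x, kernel)), 2)
    return dense(features, weights, bias)

signal = [1.0, 2.0, 3.0, 4.0, 5.0]          # hypothetical input
output = forward(signal, [-1.0, 1.0], [[1.0, 2.0]], [0.5])
print(output)
```

In a trained network the kernel and dense weights would be learned from data rather than set by hand, and a 2D convolution would be used for image-like inputs.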
“…Instead, a large number of computational methods have been developed for prediction of T4SEs in the last decade, which successfully speed up the process in terms of time and efficiency. These computational approaches can be categorized into two main groups: the first group of approaches infer new effectors based on sequence similarity with currently known effectors (Chen et al, 2010; Lockwood et al, 2011; Marchesini et al, 2011; Meyer et al, 2013; Sankarasubramanian et al, 2016; Noroy et al, 2019) or phylogenetic profiling analysis (Zalguizuri et al, 2019), and the second group of approaches involve learning the patterns of known secreted effectors that distinguish them from nonsecreted proteins based on machine learning and deep learning techniques (Burstein et al, 2009; Lifshitz et al, 2013; Zou et al, 2013; Wang et al, 2014; Ashari et al, 2017; Wang Y. et al, 2017; Esna Ashari et al, 2018; Guo et al, 2018; Xiong et al, 2018; Xue et al, 2018; Acici et al, 2019; Chao et al, 2019; Hong et al, 2019; Wang J. et al, 2019; Yan et al, 2020). In the latter group of methods, Burstein et al (2009) worked on Legionella pneumophila to identify T4SEs and validated 40 novel effectors which were predicted by machine learning algorithms.…”
Section: Introduction (mentioning)
confidence: 99%
“…However, only few information of protein sequences can be extracted, which showed a slightly weaker performance compared with the Bastion4 (Wang J. et al, 2019). Later, Acici et al (2019) developed the CNN-based model based on the conversion from protein sequences to images using AAC, DPC and PSSM feature extraction methods. More recently, Hong et al (2019) developed the new tool CNN-T4SE based on CNN, which integrated three encoding strategies: PSSM, protein secondary structure & solvent accessibility (PSSSA) and one-hot encoding scheme (Onehot), respectively.…”
Section: Introduction (mentioning)
confidence: 99%
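
Two of the feature types named in the statement above, amino acid composition (AAC) and dipeptide composition (DPC), have standard definitions: the relative frequency of each of the 20 standard residues, and of each of the 400 ordered residue pairs. The sketch below computes both for a single sequence. The final step, reshaping the 400-value DPC vector into a 20 x 20 grid, is an assumption made here for illustration and is not confirmed as the layout used in the cited work. The function names and the example sequence are hypothetical, and PSSM features are not covered because they require an alignment search.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs

def aac(sequence):
    """Amino acid composition: fraction of each standard residue (20 values)."""
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    total = len(residues)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / total for a in AMINO_ACIDS]

def dpc(sequence):
    """Dipeptide composition: fraction of each adjacent residue pair (400 values).

    Pairs that contain a non-standard character are skipped.
    """
    seq = sequence.upper()
    counts = {d: 0 for d in DIPEPTIDES}
    total = 0
    for a, b in zip(seq, seq[1:]):
        pair = a + b
        if pair in counts:
            counts[pair] += 1
            total += 1
    return [counts[d] / total if total else 0.0 for d in DIPEPTIDES]

def to_grid(vector, width):
    """Reshape a flat vector into rows of `width` values (assumed image layout)."""
    return [vector[i:i + width] for i in range(0, len(vector), width)]

example = "AAC"  # hypothetical toy sequence
grid = to_grid(dpc(example), 20)
print(len(aac(example)), len(grid), len(grid[0]))
```

Each row of the grid corresponds to the first residue of a pair and each column to the second, so the grid can be passed to a 2D convolution like a single-channel image.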