2022
DOI: 10.1093/bioinformatics/btac374

Effector-GAN: prediction of fungal effector proteins based on pretrained deep representation learning methods and generative adversarial networks

Abstract: Motivation: Phytopathogenic fungi secrete effector proteins to subvert host defenses and facilitate infection. Systematic analysis and prediction of candidate fungal effector proteins is crucial for experimental validation and biological control of plant disease. However, two problems are still considered intractable in fungal effector prediction: one is the high diversity of effector sequences, which increases the difficulty of protein feature learning, and the other is the c…

Cited by 10 publications (5 citation statements)
References 31 publications
“…Deep representation learning module can automatically learn global and local features of the sequence ( Wang et al 2021 , Lv et al 2021 , Wang et al 2022 ). Traditional manual feature extraction methods are constructed according to AAC, evolutionary information, sequence information, physical and chemical properties, and so on ( Shi et al 2022 , Li et al 2022 , Zhang and Jing 2023 ).…”
Section: Methods (mentioning)
confidence: 99%
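
The citing statement above lists AAC (amino acid composition) among the traditional hand-crafted features. As a minimal sketch of what such a feature looks like, assuming the usual definition of AAC as the relative frequency of each of the 20 standard amino acids, the following computes a 20-dimensional vector. The function name `amino_acid_composition` and the choice to ignore non-standard residues are illustrative assumptions, not taken from the cited papers.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector is comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return the relative frequency of each standard amino acid.

    Assumption: residues outside the 20 standard letters (e.g. X, B, Z)
    are ignored rather than counted.
    """
    residues = [ch for ch in sequence.upper() if ch in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    total = len(residues)
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    vector = amino_acid_composition("MKTAYIAKQR")
    for aa, freq in zip(AMINO_ACIDS, vector):
        if freq > 0:
            print(f"{aa}: {freq:.2f}")
```

A vector like this discards residue order entirely, which is one reason the statement contrasts it with learned representations that capture both global and local sequence features.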
“…All metrics were calculated using the Scikit-learn package [ 80 ], and the formulas for computing these metrics were provided in Supplementary Methods. Serious data imbalance is reported to be a significant characteristic of PPI sites datasets, making MCC, F1, and AUPRC the most important and comprehensive indicators as they can emphasize more on the minority class [ 22 , 81 , 82 ].…”
Section: Methods (mentioning)
confidence: 99%
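
To illustrate why the statement above favors MCC and F1 on imbalanced data, here is a small sketch that computes accuracy, F1, and MCC from confusion-matrix counts using their standard definitions. It is written in plain Python so the formulas are visible; the cited work reports using scikit-learn, which exposes `matthews_corrcoef`, `f1_score`, and `average_precision_score` (often used as an AUPRC estimate) in `sklearn.metrics`. The toy labels and the helper names are illustrative assumptions, not data or code from the paper.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)


def f1(tp, fp, fn):
    # F1 = 2TP / (2TP + FP + FN); defined as 0 when there are no positives at all.
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)); 0 if undefined.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical imbalanced toy set: 10 positives, 90 negatives.
    y_true = [1] * 10 + [0] * 90
    # A degenerate classifier that always predicts the majority class.
    y_pred = [0] * 100

    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    print("accuracy:", accuracy(tp, tn, fp, fn))
    print("F1:", f1(tp, fp, fn))
    print("MCC:", mcc(tp, tn, fp, fn))
```

In this toy case the majority-class predictor looks strong on accuracy while F1 and MCC both collapse, which is the behavior the statement relies on when it calls these metrics more informative for the minority class.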
“…To convert protein sequences into embeddings, the pretrained protein language model, ProtBERT, was used to generate an L × 1,024 matrix for each protein sequence, where L is the sequence length and each amino acid is represented by a 1,024 embedding vector. ProtBERT is a BERT model pretrained on UniRef100 through self-supervised learning, which can capture biophysical features of protein sequences [ 48 , 82 , 83 ]. The embeddings of proteins were further passed to the 2 base models of EnsemPPIS, namely, TransformerPPIS and GatCNNPPIS.…”
Section: Methods (mentioning)
confidence: 99%
“…MRI predominates in radiogenomics for breast imaging [ 91 ] and has been found to be the most accurate test for finding BC [ 92 , 93 , 94 ]. Yamamoto et al looked at 10 patients who had preoperative dynamic contrast-enhanced (DCE)-MRI and global gene expression data [ 95 , 96 , 97 , 98 , 99 , 100 , 101 , 102 , 103 , 104 , 105 , 106 , 107 , 108 , 109 , 110 , 111 , 112 , 113 , 114 , 115 , 116 , 117 , 118 ]. The relationship between MRI phenotypes and underlying global BC gene expression patterns was presented using a preliminary radiogenomic association map.…”
Section: Current Application Of Radiogenomics In Oncology (mentioning)
confidence: 99%
“…Multi-modal analysis has found application across diverse domains including geographical and biomedical image analysis [ 97 , 98 ], video analysis [ 99 , 100 ], and sentiment analysis [ 101 ]. Various methods facilitate co-learning in multi-modal analysis, such as tensor learning [ 102 ], generative models [ 103 ], graphical models [ 104 , 105 ], prior knowledge regularization [ 106 ], multiple kernel learning [ 107 ], and neural networks [ 108 , 109 , 110 ].…”
Section: Current Application Of Radiogenomics In Oncology (mentioning)
confidence: 99%