Effector-GAN: prediction of fungal effector proteins based on pretrained deep representation learning methods and generative adversarial networks

Wang, Yansu; Luo, Ximei; Zou, Quan

doi:10.1093/bioinformatics/btac374

Cited by 10 publications

(5 citation statements)

References 31 publications


“…Deep representation learning module can automatically learn global and local features of the sequence (Wang et al 2021, Lv et al 2021, Wang et al 2022). Traditional manual feature extraction methods are constructed according to AAC, evolutionary information, sequence information, physical and chemical properties, and so on (Shi et al 2022, Li et al 2022, Zhang and Jing 2023).…”

Section: Methods (mentioning)

confidence: 99%

AACFlow: an end-to-end model based on attention augmented convolutional neural network and flow-attention mechanism for identification of anticancer peptides

Zhang, Zhao, Liang (2024). Bioinformatics.


Motivation: Anticancer peptides (ACPs) have natural cationic properties and can act on the anionic cell membrane of cancer cells to kill cancer cells. Therefore, ACPs have become a potential anticancer drug with good research value and prospect. Results: In this paper, we propose AACFlow, an end-to-end model for identification of ACPs based on deep learning. End-to-end models have more room to automatically adjust according to the data, making the overall fit better and reducing error propagation. The combination of attention augmented convolutional neural network (AAConv) and multi-layer convolutional neural network (CNN) forms a deep representation learning module, which is used to obtain global and local information on the sequence. Based on the concept of flow network, multi-head flow-attention mechanism is introduced to mine the deep features of the sequence to improve the efficiency of the model. On the independent test dataset, the ACC, Sn, Sp, and AUC values of AACFlow are 83.9%, 83.0%, 84.8%, and 0.892, respectively, which are 4.9%, 1.5%, 8.0%, and 0.016 higher than those of the baseline model. The MCC value is 67.85%. In addition, we visualize the features extracted by each module to enhance the interpretability of the model. Various experiments show that our model is more competitive in predicting ACPs. Availability: The codes and datasets are accessible at https://github.com/z11code/AACFlow. Supplementary information: Supplementary data are available at Bioinformatics online.


“…All metrics were calculated using the Scikit-learn package [80], and the formulas for computing these metrics were provided in Supplementary Methods. Serious data imbalance is reported to be a significant characteristic of PPI sites datasets, making MCC, F1, and AUPRC the most important and comprehensive indicators as they can emphasize more on the minority class [22, 81, 82].…”

Section: Methods (mentioning)

confidence: 99%
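As an illustration of why the statement above favours MCC and F1 on imbalanced data, the following is a minimal sketch that computes accuracy, F1, and MCC from binary confusion-matrix counts using the standard textbook definitions. It is not the citing paper's code (which reports using Scikit-learn, with formulas in its Supplementary Methods); the function name `confusion_metrics` and the example counts are hypothetical, and AUPRC is omitted because it needs ranked scores rather than counts.

```python
import math


def confusion_metrics(tp, fp, fn, tn):
    """Return accuracy, F1, and MCC from binary confusion-matrix counts.

    Hypothetical helper for illustration; uses the standard definitions.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # F1 is the harmonic mean of precision and recall; taken as 0 when tp == 0.
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    # MCC is taken as 0 when any marginal total is zero (undefined case).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


# Made-up imbalanced example: 20 positives among 1000 samples.
example = confusion_metrics(tp=5, fp=5, fn=15, tn=975)
print(example)
```

With these made-up counts, accuracy is 0.98 even though only a quarter of the positives are recovered, while F1 and MCC both stay close to one third, which is the behaviour the quoted statement relies on.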

“…To convert protein sequences into embeddings, the pretrained protein language model, ProtBERT, was used to generate an L × 1,024 matrix for each protein sequence, where L is the sequence length and each amino acid is represented by a 1,024 embedding vector. ProtBERT is a BERT model pretrained on UniRef100 through self-supervised learning, which can capture biophysical features of protein sequences [48, 82, 83]. The embeddings of proteins were further passed to the 2 base models of EnsemPPIS, namely, TransformerPPIS and GatCNNPPIS.…”

Section: Methods (mentioning)

confidence: 99%

A Transformer-Based Ensemble Framework for the Prediction of Protein–Protein Interaction Sites

Mou, Pan, Zhou et al. (2023). Research.


The identification of protein–protein interaction (PPI) sites is essential in the research of protein function and the discovery of new drugs. So far, a variety of computational tools based on machine learning have been developed to accelerate the identification of PPI sites. However, existing methods suffer from the low predictive accuracy or the limited scope of application. Specifically, some methods learned only global or local sequential features, leading to low predictive accuracy, while others achieved improved performance by extracting residue interactions from structures but were limited in their application scope for the serious dependence on precise structure information. There is an urgent need to develop a method that integrates comprehensive information to realize proteome-wide accurate profiling of PPI sites. Herein, a novel ensemble framework for PPI sites prediction, EnsemPPIS, was therefore proposed based on transformer and gated convolutional networks. EnsemPPIS can effectively capture not only global and local patterns but also residue interactions. Specifically, EnsemPPIS was unique in (a) extracting residue interactions from protein sequences with transformer and (b) further integrating global and local sequential features with the ensemble learning strategy. Compared with various existing methods, EnsemPPIS exhibited either superior performance or broader applicability on multiple PPI sites prediction tasks. Moreover, pattern analysis based on the interpretability of EnsemPPIS demonstrated that EnsemPPIS was fully capable of learning residue interactions within the local structure of PPI sites using only sequence information. The web server of EnsemPPIS is freely available at http://idrblab.org/ensemppis .


“…MRI predominates in radiogenomics for breast imaging [91] and has been found to be the most accurate test for finding BC [92, 93, 94]. Yamamoto et al looked at 10 patients who had preoperative dynamic contrast-enhanced (DCE)-MRI and global gene expression data [95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118]. The relationship between MRI phenotypes and underlying global BC gene expression patterns was presented using a preliminary radiogenomic association map.…”

Section: Current Application of Radiogenomics in Oncology (mentioning)

confidence: 99%

“…Multi-modal analysis has found application across diverse domains including geographical and biomedical image analysis [97, 98], video analysis [99, 100], and sentiment analysis [101]. Various methods facilitate co-learning in multi-modal analysis, such as tensor learning [102], generative models [103], graphical models [104, 105], prior knowledge regularization [106], multiple kernel learning [107], and neural networks [108, 109, 110].…”

Section: Current Application of Radiogenomics in Oncology (mentioning)

confidence: 99%

The Convergence of Radiology and Genomics: Advancing Breast Cancer Diagnosis with Radiogenomics

Demetriou, Lockhat, Brzozowski et al. (2024). Cancers.


Despite significant progress in the prevention, screening, diagnosis, prognosis, and therapy of breast cancer (BC), it remains a highly prevalent and life-threatening disease affecting millions worldwide. Molecular subtyping of BC is crucial for predictive and prognostic purposes due to the diverse clinical behaviors observed across various types. The molecular heterogeneity of BC poses uncertainties in its impact on diagnosis, prognosis, and treatment. Numerous studies have highlighted genetic and environmental differences between patients from different geographic regions, emphasizing the need for localized research. International studies have revealed that patients with African heritage are often diagnosed at a more advanced stage and exhibit poorer responses to treatment and lower survival rates. Despite these global findings, there is a dearth of in-depth studies focusing on communities in the African region. Early diagnosis and timely treatment are paramount to improving survival rates. In this context, radiogenomics emerges as a promising field within precision medicine. By associating genetic patterns with image attributes or features, radiogenomics has the potential to significantly improve early detection, prognosis, and diagnosis. It can provide valuable insights into potential treatment options and predict the likelihood of survival, progression, and relapse. Radiogenomics allows for visual features and genetic marker linkage that promises to eliminate the need for biopsy and sequencing. The application of radiogenomics not only contributes to advancing precision oncology and individualized patient treatment but also streamlines clinical workflows. This review aims to delve into the theoretical underpinnings of radiogenomics and explore its practical applications in the diagnosis, management, and treatment of BC and to put radiogenomics on a path towards fully integrated diagnostics.
