2023
DOI: 10.1093/bib/bbad083

DaDL-SChlo: protein subchloroplast localization prediction based on generative adversarial networks and pre-trained protein language model

Abstract: Chloroplast is a crucial site for photosynthesis in plants. Determining the location and distribution of proteins in subchloroplasts is significant for studying the energy conversion of chloroplasts and regulating the utilization of light energy in crop production. However, the prediction accuracy of the currently developed protein subcellular site predictors is still limited due to the complex protein sequence features and the scarcity of labeled samples. We propose DaDL-SChlo, a multi-location protein subchl…

Citations: cited by 5 publications (2 citation statements)
References: 32 publications
“…One possible extension is to collect additional TIPs to develop a more comprehensive prediction model. Another extension could be the employment of well-known feature extractors, such as a bidirectional recurrent neural network (RNN) [55] and ProtBERT [56], to effectively capture the key information of TIPs. For the last extension, we can try to incorporate TIPred with recent innovative computational frameworks, such as an iterative feature representation algorithm [57] and deep learning (DL)-based framework [39, 58].…”
Section: Discussion (mentioning)
confidence: 99%
“…In addition, a long short-term memory network (LSTM) which combines the previous states and current inputs is also commonly used [56,57], with Generative Adversarial Network (GAN) [58] and Synthetic Minority Over-sampling Technique (SMOTE) [59] used for synthesizing minority samples to deal with data imbalance. Developing data augmentation methods by deep learning algorithms has also made protein language model construction possible [60,61]. Through transfer learning [62], pretrained models can be fine-tuned on different downstream tasks, reducing the need for large amounts of labeled data for training.…”
Section: Sequences-based AI Approaches (mentioning)
confidence: 99%