One of the major design issues in machine learning (ML) models for materials property prediction (MPP) is how to enable the models to learn property-related physicochemical features. While many composition...
Interactions between human leukocyte antigens (HLAs) and peptides play a critical role in the human immune system. Accurate computational prediction of HLA-binding peptides can be used for peptide drug discovery. Currently, the best prediction algorithms are neural network-based pan-specific models, which take advantage of the large amount of data available across HLA alleles. However, current pan-specific models all rely on the pseudo-sequence encoding to model the binding context, which is based on 34 positions identified from HLA-peptide bound structures in earlier work. In this work, we propose a novel deep convolutional neural network (DCNN) model, DeepSeqPan, for HLA-peptide binding prediction, in which the encoding of the HLA sequence and the binding context are both learned by the network itself without requiring HLA-peptide bound structure information. The model is also characterized by its binding context extraction layer and its dual outputs: a binding affinity output and a binding probability output. Evaluation on public benchmark datasets shows that DeepSeqPan, trained without HLA structural information, achieves state-of-the-art performance on a large number of HLA alleles with good generalization capability. Since the model only needs raw sequences from HLA-peptide binding pairs, it can be applied to binding prediction for HLAs without structure information and to other protein binding problems such as protein-DNA and protein-RNA binding. The implementation code and trained models are freely available at https://github.com/pcpLiu/DeepSeqPan.
Accurate prediction of peptide binding affinity to major histocompatibility complex (MHC) proteins could help in designing better therapeutic vaccines. Previous work has shown that pan-specific prediction algorithms can achieve better prediction performance than other approaches. However, most of the top algorithms are neural network-based black-box models. Here, we propose DeepAttentionPan, an improved pan-specific model based on convolutional neural networks and attention mechanisms for more flexible, stable and interpretable MHC-I binding prediction. With the attention mechanism, our ensemble model consisting of 20 trained networks achieves high and more stable prediction performance. Extensive tests on IEDB's weekly benchmark dataset show that our method achieves state-of-the-art prediction performance on 21 test allele datasets. Analysis of the peptide positional attention weights learned by our model demonstrates its capability to capture critical binding positions of the peptides, which leads to a mechanistic understanding of MHC-peptide binding that aligns closely with experimentally verified results. Furthermore, we show that with transfer learning, our pan model can be fine-tuned for alleles with few samples to achieve additional performance improvement. DeepAttentionPan is freely available as open-source software at https://github.com/jjin49/DeepAttentionPan.
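Two ideas in this abstract lend themselves to a small illustration: positional attention weights over peptide positions, and averaging the outputs of an ensemble. The sketch below is a generic softmax-attention pooling step plus a mean over ensemble predictions. It is not the DeepAttentionPan architecture; the function names and the toy numbers are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Convert raw per-position scores into weights that sum to 1."""
    largest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, scores):
    """Weight per-position feature vectors by their attention weights.

    features: list of equal-length vectors, one per peptide position.
    scores:   one raw attention score per position.
    Returns (weights, pooled_vector); the weights indicate which
    positions contribute most to the pooled representation.
    """
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, pooled


def ensemble_mean(predictions):
    """Average the predictions of several independently trained networks."""
    return sum(predictions) / len(predictions)


# Toy example: three positions with 2-dimensional features and equal scores.
weights, pooled = attention_pool([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.0])
```

Inspecting `weights` for a trained model is the kind of analysis the abstract describes for identifying critical binding positions, while `ensemble_mean` shows the simplest way predictions from the 20 networks could be combined.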
SiCf-SiCm composites are being actively developed as fuel cladding to improve the accident tolerance of light water reactor fuel. Online monitoring of the degradation process in SiCf-SiCm composites is of great importance for ensuring the safety of the nuclear reactor system. The degradation monitoring task can be framed as a classification problem: given the acoustic emission (AE) events in a given timeslot, the model is expected to predict which of the following three stages the material is in: elastic, matrix-driven cracking, or fiber-driven cracking. In this paper, degradation tests on SiCf-SiCm composite tubes were conducted using a bladder-based internal pressure technique with AE monitoring. We then trained a deep learning-based end-to-end convolutional neural network (CNN) model for online monitoring of the damage progression process of SiCf-SiCm composite tubes using the AE data as the raw input. As a comparison, we also applied Random Forest (RF) with expert-crafted audio event features to the damage stage prediction problem. Experimental results show that both RF and CNN models yield good results, but on average our end-to-end CNN models outperform the RF models owing to their high-level feature extraction capability. The CNN model with single events reaches an average prediction accuracy of 84.4%, compared to 74% for the RF models. Combining multiple audio samples typically improves the accuracy of the models, with RF accuracy reaching 82.8% and CNN accuracy reaching 86.6%.
INDEX TERMS: Acoustic emission (AE), convolutional neural network (CNN), deep learning, online damage monitoring, random forest (RF), SiC composites.
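The abstract reports that combining multiple audio samples improves accuracy but does not say how the samples are combined. One common option, assumed here purely for illustration, is soft voting: average the per-event class probabilities within a timeslot and pick the stage with the highest mean. The function name `combine_events` and the probability values are hypothetical.

```python
STAGES = ("elastic", "matrix-driven cracking", "fiber-driven cracking")


def combine_events(event_probs):
    """Soft-vote over several AE events from the same timeslot.

    event_probs: list of probability vectors, one per event, each with
    one entry per stage in STAGES (as a classifier might output).
    Returns (predicted_stage, mean_probabilities).
    """
    if not event_probs:
        raise ValueError("at least one event is required")
    n = len(event_probs)
    mean = [sum(p[k] for p in event_probs) / n for k in range(len(STAGES))]
    best = max(range(len(STAGES)), key=lambda k: mean[k])
    return STAGES[best], mean


# Three hypothetical events: the first alone would vote "elastic",
# but the averaged evidence favors matrix-driven cracking.
stage, mean_probs = combine_events([
    [0.5, 0.4, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.6, 0.1],
])
```

Averaging smooths out single-event misclassifications, which is one plausible reason multi-sample predictions would be more accurate than single-event ones.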