Background
Compound–protein interaction prediction is needed to investigate health-regulating functions and to promote drug discovery. Machine learning is becoming increasingly important in bioinformatics, for example in the analysis of protein-related data. Modeling the properties and functions of proteins is important but challenging, especially when predictions are made from sequence data.
Result
We propose a method to model compounds and proteins for compound–protein interaction prediction. A graph neural network represents the compounds, and a convolutional layer extended with a bidirectional recurrent neural network framework (Long Short-Term Memory and Gated Recurrent Unit) vectorizes the protein sequences. The convolutional layer captures regulatory protein functions, while the recurrent layer captures long-term dependencies between protein functions, thus improving the accuracy of interaction prediction with compounds. A database of 7000 sets of annotated compound–protein interactions, containing proteins of 1000 base length, was used for the implementation. The results indicate that the proposed model performs effectively and yields satisfactory accuracy in compound–protein interaction prediction.
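The protein branch described above (a convolutional layer followed by a bidirectional recurrent layer) can be illustrated with a minimal pure-Python sketch. This is a toy under stated assumptions, not the authors' implementation: the integer residue encoding, the scalar "embedding", the fixed kernel, the weights `w_in` and `w_rec`, and all function names are hypothetical, and a plain tanh recurrence stands in for the LSTM and GRU cells named in the abstract.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def encode(seq, max_len):
    """Map residues to integer ids (0 = padding or unknown residue),
    truncating or right-padding the sequence to max_len."""
    ids = [AMINO_ACIDS.find(ch) + 1 for ch in seq[:max_len]]
    return ids + [0] * (max_len - len(ids))


def conv1d(xs, kernel):
    """'Valid' 1D convolution (cross-correlation) followed by ReLU.
    Each output summarises a short window, i.e. a local motif."""
    k = len(kernel)
    return [
        max(0.0, sum(w * xs[i + j] for j, w in enumerate(kernel)))
        for i in range(len(xs) - k + 1)
    ]


def rnn_pass(xs, w_in, w_rec):
    """Simple tanh recurrence (assumed stand-in for an LSTM/GRU cell).
    Returns the hidden state at every position."""
    h, states = 0.0, []
    for x in xs:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states


def bidirectional(xs, w_in, w_rec):
    """Run the recurrence left-to-right and right-to-left, then pair the
    two hidden states at each position."""
    fwd = rnn_pass(xs, w_in, w_rec)
    bwd = rnn_pass(xs[::-1], w_in, w_rec)[::-1]
    return list(zip(fwd, bwd))


def protein_features(seq, max_len=1000, kernel=(0.2, 0.5, 0.2)):
    """Encode -> toy embedding -> convolution -> bidirectional recurrence."""
    ids = encode(seq, max_len)
    embedded = [i / len(AMINO_ACIDS) for i in ids]  # toy scalar embedding
    motifs = conv1d(embedded, kernel)
    return bidirectional(motifs, w_in=1.0, w_rec=0.5)


if __name__ == "__main__":
    feats = protein_features("MKTAYIAKQR", max_len=12)
    print(len(feats), feats[0])
```

Each position ends up with a pair of hidden states, one summarising the motifs to its left and one the motifs to its right, which is how a bidirectional layer can relate motifs that lie far apart in the sequence. In a full model these per-position vectors would be pooled and combined with the graph neural network's compound representation before the binary interaction classifier.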
Conclusion
The performance of GCRNN is based on classification according to a binary class of interactions between proteins and compounds. The architectural design of the GCRNN model integrates a bidirectional recurrent layer on top of the CNN to learn dependencies between motifs in protein sequences and improve the accuracy of the predictions.
With the introduction of electronic medical records (EMRs), it has become possible to accumulate massive amounts of qualitative medical data. As a result, EMRs are increasingly used in clinical decision support systems (CDSSs). CDSSs aim to reduce the medical errors that normally occur when physicians treat patients, but their technical maturity and completeness do not yet meet the standards for medical use. As data continue to accumulate, CDSS algorithms must be continuously updated for CDSSs to perform their core functions. Doing so, however, requires extensive investments of time and manpower. In current practice, computational systems already perform a wide variety of functions in medical settings, allowing medical staff to focus on other tasks. However, no prior research has evaluated the potential effectiveness of future CDSSs or analyzed possibilities for their further development. In this article, we evaluate CDSS technology with the consideration that medical staff also understand the core functions of such systems.