The 3D structure of a protein is closely related to its function, and analyzing the similarity between protein structures can help reveal protein function. However, two problems arise in the analysis of 3D protein structures: proteins with similar sequences may have different structures, while proteins with similar structures may have different sequences. Traditional methods that rely on spatial feature distributions or on geometric or topological features of proteins have difficulty solving these problems. In this paper, a Tile-CNN network is proposed to analyze the similarity of 3D protein structures. To capture both the overall and the local features of a 3D protein structure, it projects 3D protein models into 2D protein images from different views and then cuts these 2D projected images using a tile strategy. After the Tile-CNN is trained on these images, a test protein model can be expressed as an analysis matrix, and the similarity between 3D protein structures is then computed as the root mean square distance (RMSD) between the benchmark matrix and the analysis matrix. The experimental results show that the proposed algorithm is more robust in analyzing the similarity of 3D protein structures and performs satisfactorily on the two aforementioned problems.
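The tiling and RMSD steps can be illustrated with a minimal sketch. The abstract does not specify the tile size, whether tiles overlap, or how the benchmark and analysis matrices are built, so everything here is an assumption: the helper names `cut_tiles` and `rmsd`, the non-overlapping tiling, and the toy matrices are hypothetical and only show the general idea.

```python
import math


def cut_tiles(image, tile_h, tile_w):
    """Split a 2D image (list of rows) into non-overlapping tiles.

    Assumption: tiles do not overlap and partial tiles at the
    border are discarded. Tiles are returned in row-major order.
    """
    rows, cols = len(image), len(image[0])
    tiles = []
    for r in range(0, rows - tile_h + 1, tile_h):
        for c in range(0, cols - tile_w + 1, tile_w):
            tiles.append([row[c:c + tile_w] for row in image[r:r + tile_h]])
    return tiles


def rmsd(a, b):
    """Root mean square distance between two equally shaped matrices."""
    if len(a) != len(b) or any(len(x) != len(y) for x, y in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    squared = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return math.sqrt(sum(squared) / len(squared))


# Toy 4x4 "projected image" cut into four 2x2 tiles.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
tiles = cut_tiles(image, 2, 2)

# Toy benchmark and analysis matrices (hypothetical values).
benchmark = [[1.0, 0.0], [0.0, 1.0]]
analysis = [[1.0, 0.0], [0.0, 0.0]]
score = rmsd(benchmark, analysis)
```

A smaller `score` means the analysis matrix lies closer to the benchmark matrix, which under the abstract's description corresponds to more similar structures.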
DNA, or deoxyribonucleic acid, is a powerful molecule that plays a fundamental role in storing and processing the genetic information of all living organisms. In recent years, scientists around the world have worked to exploit its high density, energy efficiency, and long durability to address challenges in information technology. Here, we propose building an instance-based learning model from DNA molecules. The handwritten digit images in the MNIST dataset are encoded as DNA sequences using a deep learning encoder, and the reverse complementary sequence of a query image is used to hybridize with the training instance sequences. Simulation results from NUPACK show that this DNA-based classification model could achieve 95% accuracy on average. Wet-lab experiments also validate that the predicted yield is consistent with the hybridization strength. Our work demonstrates that it is feasible to build an effective instance-based classification model for practical applications.
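The query step can be sketched as follows. This is not the authors' method: the deep learning encoder is omitted, and NUPACK's thermodynamic simulation is replaced by a crude stand-in that simply counts Watson-Crick pairs in a fixed antiparallel alignment. The names `reverse_complement`, `pairing_score`, and `classify`, and the toy sequences, are all hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def pairing_score(probe, target):
    """Count Watson-Crick pairs when probe binds target antiparallel.

    Assumption: a fixed, gapless alignment. This is a rough proxy for
    hybridization strength, not a thermodynamic model.
    """
    return sum(COMPLEMENT[p] == t for p, t in zip(reversed(probe), target))


def classify(query_seq, instances):
    """Label a query by the training instance its probe pairs with best.

    `instances` is a list of (sequence, label) tuples.
    """
    probe = reverse_complement(query_seq)
    best_seq, best_label = max(
        instances, key=lambda item: pairing_score(probe, item[0])
    )
    return best_label


# Toy training instances (hypothetical sequences, not real encodings).
instances = [("ACGTACGT", "0"), ("TTGGCCAA", "1")]
label = classify("ACGTACGA", instances)
```

Under this proxy the score equals the number of positions where the query and the instance share a base, so the classifier reduces to a nearest-neighbor lookup over sequences.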