2015
DOI: 10.1007/s10723-015-9353-8

Scaling Ab Initio Predictions of 3D Protein Structures in Microsoft Azure Cloud

Abstract: Computational methods for protein structure prediction allow us to determine a three-dimensional structure of a protein based on its pure amino acid sequence. These methods are a very important alternative to costly and slow experimental methods, like X-ray crystallography or Nuclear Magnetic Resonance. However, conventional calculations of protein structure are time-consuming and require ample computational resources, especially when carried out with the use of ab initio methods that rely on physical forces a…

citations
Cited by 33 publications
(15 citation statements)
references
References 66 publications
“…For the ROC curve, 1-specificity was plotted on the horizontal axis, and sensitivity on the vertical axis. LOO, K-Fold cross-validation, and independent testing are the most widely used methods for predictor evaluation (Mrozek et al, 2015; Cao and Cheng, 2016; Chen et al, 2017, 2018a, 2019b; Pan et al, 2017; He et al, 2018, 2019; Jiang et al, 2018; Xiong et al, 2018; Yu et al, 2018; Zhang et al, 2018; Ding et al, 2019; Feng et al, 2019; Kong and Zhang, 2019; Li and Liu, 2019; Lv et al, 2019a; Manavalan et al, 2019; Shan et al, 2019; Wang et al, 2019a; Wei et al, 2019a,b; Xu et al, 2019; Yu and Dai, 2019). That is the general machine learning evaluation methods (training, validation and testing) are used for optimized model evaluation.…”
Section: Model Evaluation Metrics and Methods
Citation type: mentioning
confidence: 99%
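
The evaluation ideas named in the statement above (ROC axes, K-fold and leave-one-out splits) can be illustrated with a minimal sketch. It is not taken from the citing paper; the function names `roc_points` and `k_fold_indices` and the toy labels and scores are assumptions made for illustration only.

```python
from typing import List, Tuple


def roc_points(labels: List[int], scores: List[float]) -> List[Tuple[float, float]]:
    """Return (1 - specificity, sensitivity) pairs, one per score threshold.

    Assumes labels are 0/1 and that both classes are present.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Sweep thresholds from high to low; a sample is predicted positive
    # when its score is at or above the threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        # x = false positive rate (1 - specificity), y = sensitivity
        points.append((fp / negatives, tp / positives))
    return points


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs; k == n_samples gives leave-one-out."""
    splits = []
    for i in range(k):
        test = list(range(i, n_samples, k))  # every k-th sample, offset i
        held_out = set(test)
        train = [j for j in range(n_samples) if j not in held_out]
        splits.append((train, test))
    return splits


if __name__ == "__main__":
    # Hypothetical toy data, not results from any cited study.
    print(roc_points([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
    print(k_fold_indices(6, 3))
```

Setting `k` equal to the number of samples turns the same split routine into leave-one-out, which is why the two schemes are usually described together.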
“…Many hot issues in various sub-fields of bioinformatics were also solved with the use of Big Data ecosystems and Cloud computing, e.g., mapping next-generation sequence data to the human genome and other reference genomes, for use in a variety of biological analyzes including SNP discovery, genotyping and personal genomics [65], sequence analysis and assembly [17,30,34,35,47,62], multiple alignments of DNA and RNA sequences [86,91], codon analysis with local MapReduce aggregations [63], NGS data analysis [8], phylogeny [24,48], proteomics [37], analysis of protein-ligand binding sites [23], and others. Regarding the analysis of 3D protein structures, it is worth mentioning several works, including Hazelhurst et al [20] and Małysiak-Mrozek et al [46] devoted to exploration of various atomic interactions within protein structures, works of Che-Lun Hung and Yaw-Ling Lin [25], and Mrozek et al [51,53,55-57], devoted to comparison and alignment of 3D protein structures, and cloud-based system for 3D protein structure modeling presented in [54]. However, none of the mentioned works was focused on prediction of disordered regions.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Since deep insight into 3D protein structures is a key for understanding molecular mechanisms of many civilization diseases and for the production of effective drugs, structural genomics tries to determine and describe the 3D structure of every protein that is encoded by a given sequenced genome. This is done by combining traditional experimental methods, like X-ray crystallography or Nuclear Magnetic Resonance (NMR), with computational modeling approaches that use various prediction methods for structure determination [18,54,76,80].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…As a result, single invocation of the map function can provide many similarity results (many sets of similarity measures), but each subsequent outcome is related to different pair of compared protein chains. Finally, results of each alignment process, including algorithm-specific similarity measures (like identity, similarity, RMSD, score, and probability), timestamp, and identifier of the candidate protein structure and its chain, are gathered and saved as BSON documents that are stored in the MongoDB database [3] (lines 31-35). Single entry in the MongoDB database contains the entry id, identifiers of compared proteins and their chains, and the binary BSON document with the outcome of a single alignment.…”
Section: Map Task
Citation type: mentioning
confidence: 99%
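
The shape of the stored entry described in the statement above can be sketched as follows. This is an illustrative assumption, not the citing paper's code: `toy_similarity` is a hypothetical stand-in for a real structure alignment algorithm, the field names are invented, and JSON bytes stand in for the binary BSON document so that the sketch needs no database driver.

```python
import json
from typing import Dict, Iterator, List, Tuple

# (protein identifier, chain identifier, chain sequence); hypothetical layout
Chain = Tuple[str, str, str]


def toy_similarity(a: str, b: str) -> Dict[str, float]:
    """Hypothetical placeholder measure: fraction of matching positions."""
    matches = sum(x == y for x, y in zip(a, b))
    return {"identity": matches / max(len(a), len(b), 1)}


def map_task(query: Chain, candidates: List[Chain], timestamp: str) -> Iterator[dict]:
    """Yield one entry per compared pair of chains.

    A single call can therefore produce many similarity results, each tied
    to a different (query chain, candidate chain) pair.
    """
    q_id, q_chain, q_seq = query
    for c_id, c_chain, c_seq in candidates:
        outcome = dict(toy_similarity(q_seq, c_seq))
        outcome.update(timestamp=timestamp, candidate=c_id, candidate_chain=c_chain)
        yield {
            # Entry id built from the identifiers (an assumption for this sketch)
            "_id": f"{q_id}{q_chain}-{c_id}{c_chain}",
            "query": {"protein": q_id, "chain": q_chain},
            "candidate": {"protein": c_id, "chain": c_chain},
            # JSON bytes used here in place of a binary BSON document
            "outcome": json.dumps(outcome, sort_keys=True).encode("utf-8"),
        }


if __name__ == "__main__":
    query = ("1ABC", "A", "ACDE")  # hypothetical identifiers and sequences
    candidates = [("2XYZ", "B", "ACDF"), ("3DEF", "A", "AC")]
    for entry in map_task(query, candidates, "2015-01-01T00:00:00Z"):
        print(entry["_id"], json.loads(entry["outcome"]))
```

In a real deployment each yielded entry would be written to the database by the map task; that step is omitted here because the source does not show the storage calls.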
“…Driven by advances in DNA sequencing techniques and the huge gap between the number of known protein sequences [19] and 3D protein structures [18], structural genomics tries to find the 3D structure of every protein that is encoded by a given sequenced genome. This is done by combining traditional experimental methods, like X-ray crystallography or nuclear magnetic resonance (NMR), with modeling approaches that use various prediction methods [10,23,31,44]. These prediction methods, used for protein structure determination, may rely on sequence or structural homology [24,45] to a protein of the structure already determined and stored in a repository, such as the world-renowned Protein Data Bank [1].…”
Section: Introduction
Citation type: mentioning
confidence: 99%