Carbohydrate chains are ubiquitous in the complex molecular processes of life. These highly diverse chains are recognized by a variety of protein receptors, enabling glycans to regulate many biological functions. High-resolution structures of protein–glycoligand complexes reveal the atomic details necessary to understand this level of molecular recognition and inform application-focused scientific and engineering pursuits. When experimental challenges hinder high-throughput determination of quality structures, computational tools can, in principle, fill the gap. In this work, we introduce GlycanDock, a residue-centric protein–glycoligand docking refinement algorithm developed within the Rosetta macromolecular modeling and design software suite. We performed a benchmark docking assessment using a set of 109 experimentally determined protein–glycoligand complexes as well as 62 unbound protein structures. The GlycanDock algorithm can sample and discriminate among protein–glycoligand models of native-like structural accuracy with statistical reliability from starting structures of up to 7 Å root-mean-square deviation in the glycoligand ring atoms. We show that GlycanDock-refined models qualitatively replicated the known binding specificity of a bacterial carbohydrate-binding module. Finally, we present a protein–glycoligand docking pipeline for generating putative protein–glycoligand complexes when only the glycoligand sequence and unbound protein structure are known. In combination with other carbohydrate modeling tools, the GlycanDock docking refinement algorithm will accelerate research in the glycosciences.
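As a rough illustration of the accuracy measure quoted above, the following minimal sketch computes a root-mean-square deviation over matched ring-atom coordinates. It assumes the two structures are already in the same frame (for example, with the protein superimposed), and it is not the GlycanDock or Rosetta implementation; the function name `ring_atom_rmsd`, the example coordinates, and the use of 7.0 as a cutoff variable are illustrative assumptions.

```python
import math

def ring_atom_rmsd(model_xyz, reference_xyz):
    """RMSD in angstroms between two matched lists of (x, y, z) tuples.

    No superposition is performed: both coordinate sets are assumed
    to already share a common reference frame.
    """
    if len(model_xyz) != len(reference_xyz) or not model_xyz:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = sum(
        (a - b) ** 2
        for p, q in zip(model_xyz, reference_xyz)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared_sum / len(model_xyz))

# Hypothetical coordinates: every atom of the model is displaced by (3, 4, 0).
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
model = [(3.0, 4.0, 0.0), (4.5, 4.0, 0.0)]

start_rmsd = ring_atom_rmsd(model, reference)

# Illustrative check against the 7 angstrom starting range quoted in the abstract.
MAX_START_RMSD = 7.0
in_range = start_rmsd <= MAX_START_RMSD
```

Here each atom is displaced by exactly 5 Å, so the computed RMSD is 5 Å and the hypothetical starting structure falls inside the quoted range.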
Contents: Fig S1: SymDock2 model of T118; Fig S2: Docking failures in Round 37; Fig S3: Docking failures in Rounds 39 and 42; Fig S4: Post-hoc analysis of T130
Each year vast international resources are wasted on irreproducible research. The scientific community has been slow to adopt standard software engineering practices, despite the increases in high-dimensional data, complexities of workflows, and computational environments. Here we show how scientific software applications can be created in a reproducible manner when simple design goals for reproducibility are met. We describe the implementation of a test server framework and 40 scientific benchmarks, covering numerous applications in Rosetta bio-macromolecular modeling. High performance computing cluster integration allows these benchmarks to run continuously and automatically. Detailed protocol captures are useful for developers and users of Rosetta and other macromolecular modeling tools. The framework and design concepts presented here are valuable for developers and users of any type of scientific software and for the scientific community to create reproducible methods. Specific examples highlight the utility of this framework, and the comprehensive documentation illustrates the ease of adding new tests in a matter of hours.
CAPRI Rounds 37 through 45 introduced larger complexes, new macromolecules, and multi-stage assemblies. For these rounds, we used and expanded docking methods in Rosetta to model 23 target complexes. We successfully predicted 14 target complexes and recognized and refined near-native models generated by other groups for two further targets. Notably, for targets T110 and T136, we achieved the closest prediction of any CAPRI participant. We created several innovative approaches during these rounds. Since Round 39 (target 122), we have used the new RosettaDock 4.0, which has a revamped coarse-grained energy function and the ability to perform conformer selection during docking with hundreds of pre-generated protein backbones. Ten of the complexes had some degree of symmetry in their interactions, so we tested Rosetta SymDock, realized its shortcomings, and developed the next-generation symmetric docking protocol, SymDock2, which includes docking of multiple backbones and induced-fit refinement. Since the last CAPRI assessment, we also developed methods for modeling and designing carbohydrates in Rosetta, and we used them to successfully model oligosaccharide-protein complexes in Round 41. While the results were broadly encouraging, they also highlighted the pressing need to invest in (1) flexible docking algorithms with the ability to model loop and linker motions and in (2) new sampling and scoring methods for oligosaccharide-protein interactions.
Biomolecular structure drives function, and computational capabilities have progressed such that the prediction and computational design of biomolecular structures is increasingly feasible. Because computational biophysics attracts students from many different backgrounds and with different levels of resources, teaching the subject can be challenging. One strategy to teach diverse learners is with interactive multimedia material that promotes self-paced, active learning. We have created a hands-on education strategy with a set of 16 modules that teach topics in biomolecular structure and design, from fundamentals of conformational sampling and energy evaluation to applications, such as protein docking, antibody design, and RNA structure prediction. Our modules are based on PyRosetta, a Python library that encapsulates all computational modules and methods in the Rosetta software package. The workshop-style modules are implemented as Jupyter Notebooks that can be executed in the Google Colaboratory, allowing learners access with just a Web browser. The digital format of Jupyter Notebooks allows us to embed images, molecular visualization movies, and interactive coding exercises. This multimodal approach may better reach students from different disciplines and experience levels, as well as attract more researchers from smaller labs and cognate backgrounds to leverage PyRosetta in science and engineering research. All materials are freely available at https://github.com/RosettaCommons/PyRosetta.notebooks.
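To give a flavor of the "conformational sampling and energy evaluation" fundamentals mentioned above, here is a minimal sketch of Metropolis Monte Carlo on a toy one-dimensional energy. It uses only the Python standard library and does not call PyRosetta or reproduce any of the notebooks; `toy_energy`, `metropolis_accept`, `sample`, and all parameter values are hypothetical names and choices made for illustration.

```python
import math
import random

def toy_energy(x):
    """Toy 'energy function': a single harmonic well with its minimum at x = 0."""
    return x * x

def metropolis_accept(delta_e, kT, rng):
    """Metropolis criterion: always accept downhill moves, and accept
    uphill moves with probability exp(-delta_e / kT)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / kT)

def sample(start_x, n_steps, step_size, kT, seed=0):
    """Random-walk sampling; returns the lowest-energy (x, energy) visited."""
    rng = random.Random(seed)
    x, energy = start_x, toy_energy(start_x)
    best_x, best_energy = x, energy
    for _ in range(n_steps):
        # Propose a small random perturbation of the current 'conformation'.
        trial_x = x + rng.uniform(-step_size, step_size)
        trial_energy = toy_energy(trial_x)
        if metropolis_accept(trial_energy - energy, kT, rng):
            x, energy = trial_x, trial_energy
            if energy < best_energy:
                best_x, best_energy = x, energy
    return best_x, best_energy

best_x, best_energy = sample(start_x=5.0, n_steps=1000, step_size=0.5, kT=0.1)
```

The same propose, score, and accept-or-reject loop underlies many sampling protocols in molecular modeling, with a real energy function and structural moves taking the place of the toy well and the one-dimensional step.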