We develop a generalizable AI-driven workflow that leverages heterogeneous HPC resources to explore the time-dependent dynamics of molecular systems. We use this workflow to investigate the mechanisms of infectivity of the SARS-CoV-2 spike protein, the main viral infection machinery. Our workflow enables more efficient investigation of spike dynamics in a variety of complex environments, including within a complete SARS-CoV-2 viral envelope simulation, which contains 305 million atoms and shows strong scaling on ORNL Summit using NAMD. We present several novel scientific discoveries, including the elucidation of the spike’s full glycan shield, the role of spike glycans in modulating the infectivity of the virus, and the characterization of the flexible interactions between the spike and the human ACE2 receptor. We also demonstrate how AI can accelerate conformational sampling across different systems and pave the way for the future application of such methods to additional studies in SARS-CoV-2 and other molecular systems.
SARS-CoV-2 infection is controlled by the opening of the spike protein receptor binding domain (RBD), which transitions from a glycan-shielded (down) to an exposed (up) state in order to bind the human ACE2 receptor and infect cells. While snapshots of the up and down states have been obtained by cryoEM and cryoET, details of the RBD opening transition evade experimental characterization. Here, over 200 μs of weighted ensemble (WE) simulations of the fully glycosylated spike ectodomain allow us to characterize more than 300 continuous, kinetically unbiased RBD opening pathways. Together with biolayer interferometry experiments, we reveal a gating role for the N-glycan at position N343, which facilitates RBD opening. Residues D405, R408, and D427 also participate. The atomic-level characterization of the glycosylated spike activation mechanism provided herein achieves a new high-water mark for ensemble pathway simulations and offers a foundation for understanding the fundamental mechanisms of SARS-CoV-2 viral entry and infection.
The weighted ensemble (WE) strategy has been demonstrated to be highly efficient in generating pathways and rate constants for rare events such as protein folding and protein binding using atomistic molecular dynamics simulations. Here we present five tutorials instructing users in the best practices for preparing, carrying out, and analyzing WE simulations for various applications using the WESTPA software. Users are expected to already have significant experience with running standard molecular dynamics simulations using the underlying dynamics engine of interest (e.g., Amber, Gromacs, OpenMM). The tutorials range from a molecular association process in explicit solvent to more complex processes such as host-guest association, peptide conformational sampling, and protein folding.
The weighted ensemble (WE) family of methods is one of several statistical mechanics-based path sampling strategies that can provide estimates of key observables (rate constants and pathways) using a fraction of the time required by direct simulation methods such as molecular dynamics or discrete-state stochastic algorithms. WE methods oversee numerous parallel trajectories using intermittent overhead operations at fixed time intervals, enabling facile interoperability with any dynamics engine. Here, we report on the major upgrades to the WESTPA software package, an open-source, high-performance framework that implements both basic and recently developed WE methods. These upgrades offer substantial improvements over traditional WE methods. The key features of the new WESTPA 2.0 software enhance the efficiency and ease of use: an adaptive binning scheme for more efficient surmounting of large free energy barriers, streamlined handling of large simulation data sets, exponentially improved analysis of kinetics, and developer-friendly tools for creating new WE methods, including a Python API and resampler module for implementing both binned and “binless” WE strategies.
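The "intermittent overhead operations" in a binned WE scheme are a resampling step: after every fixed interval of dynamics, trajectories (walkers) are sorted into bins along a progress coordinate, then split or merged so that each occupied bin holds a target number of walkers while total probability weight is conserved. The following is a minimal toy sketch of that idea, assuming a 1D random walk as the "dynamics engine" and fixed bin edges. It is not the WESTPA API; every function name, parameter, and number here is hypothetical and chosen only for illustration.

```python
import bisect
import random
from collections import defaultdict


def propagate(state, rng, step=0.1):
    """Toy stand-in for a dynamics engine: one 1D Gaussian random-walk step."""
    return state + rng.gauss(0.0, step)


def resample_bin(walkers, target, rng):
    """Split or merge (state, weight) walkers until exactly `target` remain.

    The total weight of the bin is unchanged by either operation.
    """
    walkers = list(walkers)
    # Split: replace the heaviest walker with two copies of half its weight.
    while len(walkers) < target:
        walkers.sort(key=lambda w: w[1])
        state, weight = walkers.pop()
        walkers.append((state, weight / 2))
        walkers.append((state, weight / 2))
    # Merge: combine the two lightest walkers; the survivor is chosen with
    # probability proportional to its weight and inherits the summed weight.
    while len(walkers) > target:
        walkers.sort(key=lambda w: w[1])
        (s1, w1), (s2, w2) = walkers[0], walkers[1]
        total = w1 + w2
        survivor = s1 if rng.random() < w1 / total else s2
        walkers = walkers[2:] + [(survivor, total)]
    return walkers


def we_resample(walkers, bin_edges, target_per_bin, rng):
    """Assign walkers to bins by progress coordinate and resample each bin."""
    bins = defaultdict(list)
    for state, weight in walkers:
        bins[bisect.bisect_right(bin_edges, state)].append((state, weight))
    resampled = []
    for idx in sorted(bins):
        resampled.extend(resample_bin(bins[idx], target_per_bin, rng))
    return resampled


def run_we(n_iters=20, seed=0, target_per_bin=4):
    """Alternate fixed-length dynamics with resampling for `n_iters` rounds."""
    rng = random.Random(seed)
    bin_edges = [0.5, 1.0, 1.5]  # hypothetical fixed bins along the coordinate
    walkers = [(0.0, 1.0 / target_per_bin)] * target_per_bin
    for _ in range(n_iters):
        walkers = [(propagate(s, rng), w) for s, w in walkers]
        walkers = we_resample(walkers, bin_edges, target_per_bin, rng)
    return walkers


if __name__ == "__main__":
    final = run_we()
    print(len(final), "walkers, total weight", sum(w for _, w in final))
```

Because resampling only touches walker weights and counts, the propagation step can be swapped for any engine, which is the interoperability the abstract refers to. An adaptive binning scheme would replace the fixed `bin_edges` with edges recomputed each iteration; that is not attempted here.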