Cyclic peptides have emerged as a promising class of
therapeutics.
However, their de novo design remains challenging,
and many cyclic peptide drugs are simply natural products or their
derivatives. Most cyclic peptides, including the current cyclic peptide
drugs, adopt multiple conformations in water. The ability to characterize
cyclic peptide structural ensembles would greatly aid their rational
design. In a previous pioneering study, our group demonstrated that
machine learning models trained on molecular dynamics simulation results
can efficiently predict structural ensembles of cyclic pentapeptides.
Using this method, which was termed StrEAMM (Structural Ensembles Achieved by Molecular Dynamics and Machine Learning), linear regression models were able
to predict the structural ensembles for an independent test set with
R² = 0.94 between the predicted populations for
specific structures and the observed populations in molecular dynamics
simulations for cyclic pentapeptides. An underlying assumption in
these StrEAMM models is that cyclic peptide structural preferences
are predominantly influenced by neighboring interactions, namely,
interactions between (1,2) and (1,3) residues. Here we demonstrate
that for larger cyclic peptides such as cyclic hexapeptides, linear
regression models including only (1,2) and (1,3) interactions fail
to produce satisfactory predictions (R² = 0.47); further inclusion of
(1,4) interactions leads to moderate improvements (R² = 0.75). We show that
when using convolutional neural networks and graph neural networks
to incorporate complex nonlinear interaction patterns, we can achieve
R² = 0.97 and R² = 0.91 for cyclic pentapeptides and hexapeptides, respectively.
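
To make the neighbor-interaction assumption concrete, the following is a minimal sketch of a linear regression over (1,2) and (1,3) residue-pair features of a cyclic sequence, scored with R². It is not the StrEAMM implementation: the five-letter alphabet, the function names `pair_features` and `r_squared`, and the single scalar target are illustrative assumptions, and the data are synthetic rather than molecular dynamics populations.

```python
import numpy as np

ALPHABET = "AGVFS"  # hypothetical 5-residue alphabet, for illustration only
INDEX = {aa: i for i, aa in enumerate(ALPHABET)}

def pair_features(seq, offsets=(1, 2)):
    """Count ordered residue pairs at each cyclic offset.

    Offset 1 corresponds to (1,2) neighbors and offset 2 to (1,3) neighbors.
    """
    k = len(ALPHABET)
    n = len(seq)
    feats = np.zeros((len(offsets), k, k))
    for o, off in enumerate(offsets):
        for pos in range(n):
            a = INDEX[seq[pos]]
            b = INDEX[seq[(pos + off) % n]]  # wrap around the ring
            feats[o, a, b] += 1.0
    return feats.ravel()

def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data: random cyclic pentapeptides with a target that is
# linear in the pair features by construction (NOT simulation results).
rng = np.random.default_rng(0)
seqs = ["".join(rng.choice(list(ALPHABET), size=5)) for _ in range(300)]
X = np.array([pair_features(s) for s in seqs])
w_true = rng.normal(size=X.shape[1])
y = X @ w_true + rng.normal(scale=0.01, size=len(seqs))

# Fit on the first 200 sequences, evaluate on the remaining 100.
w_fit, *_ = np.linalg.lstsq(X[:200], y[:200], rcond=None)
r2_test = r_squared(y[200:], X[200:] @ w_fit)
print(f"held-out R^2 on synthetic data: {r2_test:.3f}")
```

Because the synthetic target is linear in the pair features by construction, this fit is expected to be nearly perfect. The hexapeptide results above indicate that real structural ensembles are not captured this way, which is why longer-range terms such as (1,4), passed here as `offsets=(1, 2, 3)`, and nonlinear models are considered.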