The power of structural information for elucidating biological
mechanisms is clear for stable folded macromolecules, but similar
structure–function
insight is more difficult to obtain for highly dynamic systems such
as intrinsically disordered proteins (IDPs), which must be described
as structural ensembles. Here, we present IDPConformerGenerator, a
flexible, modular open-source software platform for generating large
and diverse ensembles of disordered protein states by building conformers
that obey geometric, steric, and other physical restraints on the
input sequence. IDPConformerGenerator samples backbone phi (φ),
psi (ψ), and omega (ω) torsion angles of relevant sequence
fragments from loops and secondary structure elements extracted from
folded protein structures in the RCSB Protein Data Bank and builds
side chains from robust Monte Carlo algorithms using expanded rotamer
libraries. IDPConformerGenerator has many user-defined options enabling
variable fractional sampling of secondary structures, supports Bayesian
models for assessing the agreement of IDP ensembles with
experimental data, and introduces a machine learning approach
to transform between internal and Cartesian coordinates with reduced
error. IDPConformerGenerator will facilitate the characterization
of disordered proteins to ultimately provide structural insights into
these states that have key biological functions.