Coarse-grained (CG) models of biomolecules have been widely used
in protein/ribonucleic acid (RNA) three-dimensional structure prediction,
docking, drug design, and molecular simulations because of their
computational efficiency. Most of these applications depend strongly
on a reasonable estimate of the solvation free energy, which in turn
requires an accurate calculation of the solvent accessible surface area (SASA).
Although algorithms for calculating SASA from all-atom protein and
RNA structures are well established, accurately estimating the
SASA based on CG structures is extremely challenging. In this work,
we developed a deep-learning-based SASA estimator (DeepCGSA), which
provides highly accurate SASA estimates from CG structures
of protein and RNA molecules. Extensive testing showed that
for three widely used types of CG protein models (the Cα-based,
Cα–Cβ, and Martini models), the correlation coefficients
between the predicted values and the reference values reach
0.95–0.99, dramatically better than those obtained with available
methods. In addition, the new method can be used for CG RNA structures
and unfolded protein structures with much improved accuracy. We anticipate
that DeepCGSA will be highly useful in protein/RNA structure prediction,
drug design, and other applications in which accurate estimates
of SASA for CG biomolecular structures are critically important.
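
For readers unfamiliar with how SASA is computed numerically, the following is a minimal sketch of the classic Shrake-Rupley point-sampling approach applied to a set of spheres. It is not the DeepCGSA method and is not taken from this work; it only illustrates the kind of geometric calculation that is standard for all-atom structures. The function names, the bead radii, and the number of sampling points are hypothetical choices made for illustration; the 1.4 Å probe radius is the conventional value for water.

```python
import math


def sphere_points(n):
    """Return n roughly evenly spaced unit vectors (golden-spiral lattice)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n          # z runs from just below 1 to just above -1
        r = math.sqrt(max(0.0, 1.0 - z * z))   # radius of the circle at height z
        phi = golden_angle * i
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points


def shrake_rupley(coords, radii, probe=1.4, n_points=200):
    """Per-sphere SASA in Å^2 by point sampling (illustrative sketch).

    coords: list of (x, y, z) sphere centers in Å
    radii:  list of sphere radii in Å (hypothetical values for CG beads)
    """
    unit = sphere_points(n_points)
    areas = []
    for i, (ci, ri) in enumerate(zip(coords, radii)):
        big_r = ri + probe                      # radius of the probe-center surface
        exposed = 0
        for ux, uy, uz in unit:
            px = ci[0] + big_r * ux
            py = ci[1] + big_r * uy
            pz = ci[2] + big_r * uz
            buried = False
            for j, (cj, rj) in enumerate(zip(coords, radii)):
                if j == i:
                    continue
                cut = rj + probe
                d2 = (px - cj[0]) ** 2 + (py - cj[1]) ** 2 + (pz - cj[2]) ** 2
                if d2 < cut * cut:              # point lies inside a neighbor's expanded sphere
                    buried = True
                    break
            if not buried:
                exposed += 1
        # Exposed fraction of sample points times the expanded sphere's area
        areas.append(4.0 * math.pi * big_r * big_r * exposed / n_points)
    return areas


if __name__ == "__main__":
    # Two hypothetical beads of radius 2.0 Å whose centers are 3.0 Å apart
    print(shrake_rupley([(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)], [2.0, 2.0]))
```

For an isolated sphere every sample point is exposed, so the function returns the full area 4π(r + probe)²; overlapping neighbors reduce that value. The sketch uses a simple all-pairs loop for clarity, so its cost grows quadratically with the number of spheres.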