Computational modeling of protein-DNA complex structures has important implications in biomedical applications such as structure-based, computer-aided drug design. A key step in developing methods for accurate modeling of protein-DNA complexes is assessing the similarity between models and their reference complex structures. Existing methods rely primarily on distance-based metrics and generally do not consider important functional features of the complexes, such as the interface hydrogen bonds that are critical to specific protein-DNA interactions. Here, we present a new scoring function, ComparePD, which takes interface hydrogen bond energy and strength into account, in addition to distance-based metrics, to measure the similarity of protein-DNA complexes accurately. ComparePD was tested on two datasets of computational models of protein-DNA complexes, one generated by docking (classified as easy, intermediate, and difficult cases) and one by homology modeling. The results were compared with PDDockQ, a modified version of DockQ tailored for protein-DNA complexes, as well as with the metrics employed by the community-wide experiment CAPRI (Critical Assessment of PRedicted Interactions). We demonstrated that ComparePD provides an improved similarity measure over PDDockQ and the CAPRI classification method by considering both the conformational similarity and the functional importance of the complex interface. In all cases where ComparePD and PDDockQ selected different top models, ComparePD identified the more meaningful model, except for one intermediate docking case.

Keywords: complex similarity assessment, homology modeling, hydrogen bond energy, hydrogen bonds, protein-DNA complex, protein-DNA docking
| INTRODUCTION

Knowledge of protein-DNA complex structures is critical to understanding their roles in important biological processes such as regulation of gene expression. The structures of most protein-DNA complexes, however, remain unsolved due to technical challenges in experimental methods. [1][2][3] To address this issue, in silico prediction of three-dimensional structures of protein-DNA complexes is considered a valuable alternative in applications such as structure-based, computer-aided drug discovery. 4,5 Despite efforts by the research community, computational modeling of complex macromolecular interactions remains a challenging problem. [6][7][8][9][10][11]