We have developed an automatic approach for homology modeling using restrained molecular dynamics and simulated annealing procedures, together with conformational search algorithms available in the molecular mechanics program CONGEN (Bruccoleri RE, Karplus M, 1987, Biopolymers 26:137-168). The method is validated by "predicting" the structures of two homeodomain proteins with known three-dimensional structures, and is then applied to predict the three-dimensional structure of the homeodomain of the murine Msx-1 transcription factor. Regions of the unknown protein structure that are highly homologous to the known template structure are constrained by "homology distance constraints," whereas the conformations of nonhomologous regions of the unknown protein are defined only by the potential energy function. A full energy function (excluding explicit solvent) is employed to ensure that the calculated structures have good conformational energies and are physically reasonable. As in NMR structure determinations, information on the consistency of the structure prediction is obtained by superposition of the resulting family of protein structures. In this paper, our homology modeling algorithm is described and compared with related homology modeling methods using spatial constraints derived from the structures of homologous proteins. The software is then used to predict the DNA-bound structures of three homeodomain proteins from the X-ray crystal structure of the engrailed homeodomain protein (Kissinger CR et al., 1990, Cell 63:579-590). The resulting backbone and side-chain conformations of the modeled yeast Matα2 and D. melanogaster Antennapedia homeodomains are excellent matches to the corresponding published X-ray crystal (Wolberger C et al., 1991, Cell 67:517-528) and NMR (J Mol Biol 234:1084-1097) structures, respectively.
Examination of these structures of Msx-1 reveals a network of highly conserved surface salt bridges that are proposed to play a role in regulating protein-protein interactions of homeodomains in transcription complexes.
Abbreviations: Antp, homeodomain of the D. melanogaster Antennapedia protein; DG, distance geometry calculations using metric-matrix embedding methods; Conf. E., total conformational energy including electrostatic effects, computed from the CHARMM potential function; Matα2, homeodomain of the yeast Matα2 protein; Msx-1, homeodomain of the murine Msx-1 protein; pdf, probability density function; RMSD, RMS deviation; SARMD, simulated annealing with restrained molecular dynamics; VDW E., van der Waals energy computed from the Lennard-Jones portion of the CHARMM potential function.
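The idea of "homology distance constraints" described above can be illustrated with a minimal sketch. It is not the CONGEN implementation: the flat-bottomed harmonic penalty, the tolerance, the force constant, the use of one coordinate per residue, and all function names (`homology_distance_constraints`, `restraint_energy`) are assumptions introduced here for illustration only. Distances between residues in homologous regions are copied from the template and allowed to vary within a tolerance; residues in nonhomologous regions receive no constraints and would be governed only by the potential energy function.

```python
import math


def homology_distance_constraints(template_coords, homologous, tolerance=0.5):
    """Derive (i, j, lower, upper) distance bounds from a template.

    template_coords: dict mapping residue number -> (x, y, z), one point per
        residue (e.g. a C-alpha position); an assumption of this sketch.
    homologous: set of residue numbers judged homologous to the template.
    tolerance: assumed half-width (in angstroms) of the allowed distance range.
    """
    residues = sorted(r for r in homologous if r in template_coords)
    constraints = []
    for idx, i in enumerate(residues):
        for j in residues[idx + 1:]:
            d = math.dist(template_coords[i], template_coords[j])
            constraints.append((i, j, max(0.0, d - tolerance), d + tolerance))
    return constraints


def restraint_energy(model_coords, constraints, force_constant=1.0):
    """Flat-bottomed harmonic penalty (assumed form).

    The penalty is zero while a distance lies inside [lower, upper] and grows
    quadratically with the size of the violation outside that range.
    """
    energy = 0.0
    for i, j, lower, upper in constraints:
        d = math.dist(model_coords[i], model_coords[j])
        if d < lower:
            energy += force_constant * (lower - d) ** 2
        elif d > upper:
            energy += force_constant * (d - upper) ** 2
    return energy


# Hypothetical four-residue template; residue 4 is treated as nonhomologous.
template = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0),
            3: (7.6, 0.0, 0.0), 4: (11.4, 0.0, 0.0)}
homologous = {1, 2, 3}
constraints = homology_distance_constraints(template, homologous)

# A model in which homologous residue 3 has drifted 2 angstroms from the template.
distorted = dict(template)
distorted[3] = (9.6, 0.0, 0.0)

# A model in which only the unconstrained residue 4 has moved.
loop_moved = dict(template)
loop_moved[4] = (50.0, 0.0, 0.0)

energy_template = restraint_energy(template, constraints)
energy_distorted = restraint_energy(distorted, constraints)
energy_loop_moved = restraint_energy(loop_moved, constraints)
```

In this toy case, a model identical to the template incurs no penalty, displacing a homologous residue is penalized, and displacing the nonhomologous residue leaves the restraint energy unchanged. In an actual calculation such a term would be added to the full conformational energy rather than used alone.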
Keywords
Homology modeling of homeodomain Msx-1 with CONGEN
The homeodomain is a highly conserved sequence-specific DNA-binding domain that has been found in many transcription factors. First discovered in Drosophila melanogaster, homeodomains have been found in almost every organism from nematodes to humans (Kessel & Gruss, 1990; Wang et al., 1993), and have been found to play a fu...