Accurate high-resolution refinement of protein structure models is a formidable challenge because of the delicate balance of forces in the native state, the difficulty in sampling the very large number of alternative tightly packed conformations, and the inaccuracies in current force fields. Indeed, energy-based refinement of comparative models generally leads to degradation rather than improvement in model quality, and, hence, most current comparative modeling procedures omit physically based refinement. However, despite their inaccuracies, current force fields do contain information that is orthogonal to the evolutionary information on which comparative models are based, and, hence, refinement might be able to improve comparative models if the space that is sampled is restricted sufficiently so that false attractors are avoided. Here, we use the principal components of the variation of backbone structures within a homologous family to define a small number of evolutionarily favored sampling directions and show that model quality can be improved by energy-based optimization along these directions.

With the progression of structural genomics initiatives (1-3), comparative modeling has become an increasingly important method for building protein structure models (4, 5). After a suitable structure template is chosen, accurate comparative modeling requires a correct alignment between the target protein sequence and the template sequence, an accurate method for modeling the loops (the insertions and deletions in an alignment) and side chains, and, finally, a method for refining the coordinates derived from the template structure toward those of the true native structure (6-8). In this study, we focus on this last model-refinement step.
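To make the idea of evolutionarily favored sampling directions concrete, the following is a minimal sketch of how principal components could be extracted from a set of backbone structures and used to displace a model. It is an illustration under stated assumptions, not the protocol used in this study: it assumes the family members have already been superimposed and reduced to a common set of aligned backbone atoms, it uses a plain singular value decomposition from NumPy, and the function names `principal_directions` and `displace` are hypothetical. The energy function that would score each displaced model is not shown.

```python
import numpy as np


def principal_directions(family_coords, n_components=3):
    """Principal components of backbone variation in a homologous family.

    family_coords: array of shape (n_structures, n_atoms, 3).
    Assumption: structures are already superimposed and share the same
    aligned backbone atoms (the alignment step is not shown here).
    Returns the mean structure (flattened), the top directions
    (n_components x 3*n_atoms, orthonormal rows), and their variances.
    """
    coords = np.asarray(family_coords, dtype=float)
    n_structures = coords.shape[0]
    flat = coords.reshape(n_structures, -1)   # one row per structure
    mean = flat.mean(axis=0)
    centered = flat - mean
    # Rows of vt are orthonormal directions ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    variances = singular_values ** 2 / (n_structures - 1)
    return mean, vt[:n_components], variances[:n_components]


def displace(model_coords, directions, amplitudes):
    """Move a model along the given directions by the given amplitudes."""
    flat = np.asarray(model_coords, dtype=float).reshape(-1)
    moved = flat + np.asarray(amplitudes, dtype=float) @ directions
    return moved.reshape(-1, 3)


if __name__ == "__main__":
    # Synthetic toy family: 5 structures of 4 atoms each (not real data).
    rng = np.random.default_rng(0)
    family = rng.normal(size=(5, 4, 3))
    mean, directions, variances = principal_directions(family)
    trial = displace(family[0], directions, [0.5, 0.0, -0.2])
    print(directions.shape, trial.shape)
```

In a refinement setting, the amplitudes passed to `displace` would be the few variables that an optimizer adjusts, which is what keeps the sampled space small.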
Improvement of the accuracy of comparative models is very important because accurate comparative models potentially can be used for many applications, such as virtual drug scanning (9), molecular replacement (10), and function prediction (11). Refinement is particularly important when the sequence identity between a target protein and the template protein is <30% (12), because models built by using current methods generally have rms deviations (rmsd) of >1.5 Å (13).

However, high-resolution refinement is as formidable as it is important. This difficulty is due to both the large size of conformational space and the delicate balance of forces in the native state. Indeed, in the recent CASP5 experiment (The 5th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction), most refined structures had larger rmsd to the native structure than the starting template backbone conformation (7). High-resolution refinement is thus a very stringent test of accuracy that perhaps no current force field satisfies.

Progress on this very important but very challenging problem may be facilitated by focusing on more constrained and thus more tractable refinement problems. We were led to thinking about such problems by the observation that a refinement protocol that d...