A common approach to scene understanding generates a set of structural hypotheses and evaluates them using visual features that are easy to detect. However, these features are not necessarily the most informative for discriminating among the hypotheses. This paper demonstrates that by focusing attention on regions where the hypotheses differ in how they explain the visual features, we can evaluate those hypotheses more efficiently. We define the informativeness of each feature based on the expected information gain that the feature provides with respect to the current set of hypotheses, and show how these informative features can be selected efficiently. We evaluate our attention-focusing method on a Bayesian filter-based approach to scene understanding. Our experimental results demonstrate that by focusing attention on the most informative point features, the Bayesian filter converges to a single hypothesis more efficiently, with no loss of accuracy.
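
To make the notion of informativeness concrete, the following is a minimal sketch of expected information gain, computed as the expected reduction in Shannon entropy of the hypothesis distribution after observing a feature. It is an illustration under stated assumptions, not the paper's implementation: the discrete observation model, the function names `entropy` and `expected_information_gain`, and the toy likelihood tables are all hypothetical.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a discrete distribution (zero entries skipped)."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(-np.sum(nz * np.log2(nz)))

def expected_information_gain(prior, likelihood):
    """Expected entropy reduction over hypotheses from observing one feature.

    prior:      shape (H,), current probability of each hypothesis.
    likelihood: shape (H, Z), P(observation z | hypothesis h) for this feature.
    Both are assumed inputs; the discrete observation model is an assumption.
    """
    prior = np.asarray(prior, dtype=float)
    likelihood = np.asarray(likelihood, dtype=float)
    joint = prior[:, None] * likelihood      # P(h, z)
    p_z = joint.sum(axis=0)                  # P(z), marginal over hypotheses
    gain = entropy(prior)
    for z in range(likelihood.shape[1]):
        if p_z[z] > 0:
            posterior = joint[:, z] / p_z[z]  # Bayes update for outcome z
            gain -= p_z[z] * entropy(posterior)
    return gain

# Toy example (hypothetical numbers): two hypotheses, uniform prior.
prior = [0.5, 0.5]
features = {
    # Both hypotheses predict the same observation: nothing to learn.
    "agreeing_region": [[0.9, 0.1], [0.9, 0.1]],
    # The hypotheses predict different observations: informative.
    "disagreeing_region": [[0.9, 0.1], [0.1, 0.9]],
}
gains = {name: expected_information_gain(prior, lik)
         for name, lik in features.items()}
best_feature = max(gains, key=gains.get)
```

In this toy setting the feature on which the hypotheses agree yields a gain of approximately zero, while the feature on which they disagree yields a positive gain and is selected, which mirrors the idea of attending to regions where the hypotheses explain the observations differently.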