A scientist's choice of research problem affects his or her personal career trajectory. Scientists' combined choices affect the direction and efficiency of scientific discovery as a whole. In this paper, we infer preferences that shape problem selection from patterns of published findings and then quantify their efficiency. We represent research problems as links between scientific entities in a knowledge network. We then build a generative model of discovery informed by qualitative research on scientific problem selection. We map salient features from this literature to key network properties: an entity's importance corresponds to its degree centrality, and a problem's difficulty corresponds to the network distance it spans. Drawing on millions of papers and patents published over 30 years, we use this model to infer the typical research strategy used to explore chemical relationships in biomedicine. This strategy generates conservative research choices focused on building up knowledge around important molecules. These choices become more conservative over time. The observed strategy is efficient for initial exploration of the network and supports scientific careers that require steady output, but is inefficient for science as a whole. Through supercomputer experiments on a sample of the network, we study thousands of alternatives and identify strategies much more efficient at exploring mature knowledge networks. We find that increased risk-taking and the publication of experimental failures would substantially improve the speed of discovery. We consider institutional shifts in grant making, evaluation, and publication that would help realize these efficiencies.

complex networks | computational biology | science of science | innovation | sociology of science

A scientist's choice of research problem directly affects his or her career. Indirectly, it affects the scientific community. A prescient choice can result in a high-impact study.
This boosts the scientist's reputation, but it can also create research opportunities across the field. Scientific choices are hard to quantify because of the complexity and dimensionality of the underlying problem space. In formal or computational models, problem spaces are typically encoded as simple choices between a few options (1, 2) or as highly abstract "landscapes" borrowed from evolutionary biology (3-5). The resulting insight about the relationship between research choice and collective efficiency is suggestive, but necessarily qualitative and abstract.

We obtain concrete, quantitative insight by representing the growth of knowledge as an evolving network extracted from the literature (2, 6). Nodes in the network are scientific concepts and edges are the relations between them asserted in publications. For example, molecules (a core concept in chemistry, biology, and medicine) may be linked by physical interaction (7) or shared clinical relevance (8). Variations of this network metaphor for knowledge have appeared in philosophy (9), social studies of science (10-12), artific...
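The knowledge-network representation described above, with an entity's importance read off as degree centrality and a problem's difficulty as the network distance it would span, can be sketched in a few lines of Python. This is a minimal toy illustration, not the paper's actual pipeline; the molecule names are invented for the example.

```python
from collections import deque

# Toy knowledge network: nodes are scientific entities (e.g., molecules),
# edges are relations asserted in publications. Names are illustrative,
# not drawn from the paper's data.
edges = [
    ("aspirin", "COX-1"),
    ("aspirin", "COX-2"),
    ("COX-2", "celecoxib"),
    ("celecoxib", "apoptosis"),
]

# Build an undirected adjacency map.
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

# An entity's "importance" maps to degree centrality: degree / (n - 1).
n = len(adj)
importance = {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}

def distance(src, dst):
    """Shortest path length by breadth-first search: the network
    distance a proposed link between src and dst would span."""
    seen, frontier, dist = {src}, deque([src]), {src: 0}
    while frontier:
        u = frontier.popleft()
        if u == dst:
            return dist[u]
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                dist[v] = dist[u] + 1
                frontier.append(v)
    return None  # dst unreachable from src

print(importance["aspirin"])             # 0.5 (highest degree here)
print(distance("aspirin", "apoptosis"))  # 3 (a distant, "difficult" link)
```

In this framing, a conservative strategy proposes links close to high-degree nodes (short distance, central entities), while a risky strategy proposes long-distance links between peripheral entities.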