More than 100,000 protein structures are now known at atomic detail. However, far more are not yet known, particularly among large or complex proteins. Often, experimental information is only semireliable because it is uncertain, limited, or confusing in important ways. Some experiments give sparse information, some give ambiguous or nonspecific information, and others give uncertain information, where some is right, some is wrong, but we don't know which. We describe a method called Modeling Employing Limited Data (MELD) that can harness such problematic information in a physics-based, Bayesian framework for improved structure determination. We apply MELD to eight proteins of known structure for which such problematic structural data are available, including a sparse NMR dataset, two ambiguous EPR datasets, and four uncertain datasets taken from sequence evolution data. MELD gives excellent structures, indicating its promise for experimental biomolecule structure determination where only semireliable data are available.

protein structure | molecular modeling | integrative structural biology | Bayesian inference

Increasingly, structures are determined using integrative structural biology approaches, where direct experimental data are combined with computer-based models (1). Important successes in integrative structural biology have come from pioneering methods such as Modeler (2, 3), methods based on Rosetta (4-7), and others (8). Atomistic molecular dynamics (MD) simulations can be a powerful tool in integrative structural biology, because they capture physical principles and thermodynamic forces, information that is otherwise orthogonal to purely structural observations. However, there remain many situations in which it is not yet possible to properly integrate external knowledge with atomistic MD to infer biomolecular structures. Often, the external knowledge is challenging in one or more of the following ways.
(i) Sparse data provide too little information to fully constrain the structure. (ii) Ambiguous data are not very specific, allowing alternative structural interpretations. (iii) Uncertain data cannot be interpreted at face value, because they contain false-positive signals that can be misdirective. Determining new challenging protein structures requires ways to handle semireliable data.

Here, we describe a physics-based, Bayesian computational method called MELD (Modeling Employing Limited Data). It is a procedure for making rigorous inferences from limited or uncertain data. We build upon previous Bayesian approaches (9-14), which share the key feature of combining prior belief with the available data to produce statistically consistent samples from a posterior distribution, rather than searching for a single well-scoring model. The key properties of MELD are the rigorous treatment of statistical mechanics, a novel likelihood function that can handle uncertain data, and a graphics processing unit (GPU)-accelerated sampling strategy that makes the calculations tractable.

MELD uses free energy as the princi...