The overall quality of the experimentally determined structures contained in the PDB is exceptionally high, mainly due to the continuous improvement of model building and structural validation programs. Improving reproducibility on a large scale requires expanding the concept of validation in structural biology and all other disciplines to include a broader framework that encompasses the entire project. A successful approach to science requires diligent attention to detail and a focus on the future. An earnest commitment to data availability and reuse is essential for scientific progress, be that by human minds or artificial intelligence.

1. Introduction

The Protein Data Bank (PDB)1) was formed over 50 years ago and initially contained only seven macromolecular structures. The PDB founders realized that access to macromolecular models is essential for crystallographers, students, and researchers who might use structural information for their research. Initially, every deposit contained only the coordinates of the atoms. Even then, every new structure was carefully checked to see whether the 3-D model agreed with known chemical and physical properties. Since 2007, every submission to the PDB has had two additional components: (1) information about the sample (the crystal in X-ray crystallography), a rough description of the experimental setup, and other metadata (the header), and (2) intensity amplitudes (usually called structure factors) of all diffraction spots measured during the experiment. The deposit header also contains additional information such as authors, connections to other databases, and the software used to determine the structure. These three components (coordinates, header, and structure factors) allow for calculating the electron density map and checking whether the macromolecule model, including ligands, nucleic acids, and solvent, agrees with experimental data.
This checking procedure is called validation of the macromolecular model and is usually performed by the experimenter at the end of the structure determination process. The PDB also routinely performs validation on all deposited structures.2) Moreover, scientists who find disagreement between their experiments and PDB deposits can perform validations for themselves. Sometimes their validation triggers a re-refinement and leads to the deposition of a new, improved model to the PDB.3) Occasionally, new biomedical interpretations arise from improved structural models.4)
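As a rough illustration of what checking a model against the experimental data can mean in practice, the sketch below computes the conventional crystallographic R-factor, a widely used agreement statistic between observed and model-derived structure factor amplitudes. It is a minimal sketch under stated assumptions, not the validation procedure used by the PDB: the function name `r_factor`, the single overall scale factor, and the toy amplitude values are all illustrative choices that do not come from the text.

```python
def r_factor(f_obs, f_calc):
    """Conventional R-factor: sum(| |Fobs| - k*|Fcalc| |) / sum(|Fobs|).

    f_obs  : observed structure factor amplitudes (from the experiment)
    f_calc : amplitudes calculated from the atomic model
    The single overall scale k = sum|Fobs| / sum|Fcalc| is an assumption
    made for this sketch; real refinement programs use more elaborate scaling.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    obs = [abs(x) for x in f_obs]
    calc = [abs(x) for x in f_calc]
    sum_obs = sum(obs)
    sum_calc = sum(calc)
    if sum_obs == 0 or sum_calc == 0:
        raise ValueError("amplitude sums must be non-zero")
    scale = sum_obs / sum_calc  # put model amplitudes on the observed scale
    return sum(abs(o - scale * c) for o, c in zip(obs, calc)) / sum_obs


# Toy amplitudes (hypothetical values, for illustration only).
observed = [10.0, 20.0, 30.0]
from_model = [12.0, 18.0, 30.0]
print(round(r_factor(observed, from_model), 4))
```

A value of zero would mean the model reproduces the measured amplitudes exactly; larger values indicate poorer agreement. Deposited structure factors are what make this kind of independent check possible for anyone, not only the original experimenter.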