Summary: Classical structural biology techniques struggle to determine the atomic-level structure of large, flexible macromolecules. We present a novel methodology that combines high-resolution AFM topographic images with atomic coordinates of proteins to assemble very large macromolecules or particles. Our method uses a two-step protocol: atomic coordinates of individual domains are docked beneath the molecular surface of the large macromolecule, and the domains are then assembled using a combinatorial search. The protocol was validated on three test cases: a simulated system of antibody structures, and two experimentally based cases, Tobacco mosaic virus (a rod-shaped virus) and aquaporin Z (a bacterial membrane protein). We have shown that intermediate-resolution AFM topography and partial surface data are useful constraints for building macromolecular assemblies. The protocol is applicable to multi-component structures, whether the components are connected in the polypeptide chain or are disjoint molecules. The approach effectively extends the resolution of AFM beyond topographical information down to atomic-detail structures.
The study of high-resolution topographic surfaces of isolated single molecules is one of the applications of atomic force microscopy (AFM). Because tip-induced distortions are significant in topographic images, the exact AFM tip shape must be known in order to correct dilated AFM height images using mathematical morphology operators. In this work, we present a protocol to estimate the AFM tip apex radius using tobacco mosaic virus (TMV) particles. Among the many advantages of TMV are its non-abrasiveness, thermal stability, biocompatibility with other isolated single molecules, and stability when deposited on mica pretreated with divalent ions. Compared to previous calibration systems, the advantage of using TMV resides in our detailed knowledge of the atomic structure of the entire rod-shaped particle. This property makes it possible to interpret AFM height images in terms of the three-dimensional structure of TMV. Results obtained in this study show that when a low imaging force is used, the tip senses viral protein loops, whereas at higher imaging force the tip senses the TMV particle core. The known size of the TMV particle allowed us to develop a tip-size estimation protocol that permits the successful erosion of tip-convoluted AFM height images. Our data show that the TMV particle is a well-adapted calibrator for AFM tips for imaging single isolated biomolecules. The procedure developed in this study is easily applicable to any other spherical viral particles.
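To illustrate why a particle of known size can calibrate a tip, the following is a minimal sketch based on a textbook geometric model, not the protocol of the study itself. It assumes an idealized spherical tip apex of radius R scanning a rigid cylinder of radius r lying on a flat surface, for which the apparent base width is W = 4·sqrt(R·r). The rod radius of 9 nm (the commonly cited 18 nm TMV diameter) and the function names are assumptions introduced here; the abstract gives neither.

```python
import math

# Assumption: commonly cited TMV diameter of about 18 nm (not stated in the abstract).
TMV_RADIUS_NM = 9.0


def apparent_width(tip_radius_nm, rod_radius_nm=TMV_RADIUS_NM):
    """Apparent base width of a rod on a flat surface under a spherical tip apex.

    The tip centre sits at height R when touching the surface and the rod axis
    at height r. Contact occurs when the centres are R + r apart, which gives a
    horizontal offset of 2*sqrt(R*r) on each side of the rod.
    """
    return 4.0 * math.sqrt(tip_radius_nm * rod_radius_nm)


def estimate_tip_radius(measured_width_nm, rod_radius_nm=TMV_RADIUS_NM):
    """Invert W = 4*sqrt(R*r) to recover the tip apex radius R."""
    return measured_width_nm ** 2 / (16.0 * rod_radius_nm)


if __name__ == "__main__":
    # Hypothetical measured width, chosen only for illustration.
    width = 40.0
    print(f"Estimated tip radius: {estimate_tip_radius(width):.1f} nm")
```

This model ignores sample compression, the conical tip shank, and the force-dependent contact with protein loops that the abstract describes, so it should be read as a first-order estimate only.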
Accurate and timely pregnancy diagnosis is an important component of effective herd management in dairy cattle. Predicting pregnancy from Fourier-transform mid-infrared (FT-MIR) spectroscopy data is of particular interest because the data are often already available from routine milk testing. The purpose of this study was to evaluate how well pregnancy status could be predicted in a large data set of 1,161,436 FT-MIR milk spectra records from 863,982 mixed-breed pasture-based New Zealand dairy cattle managed within seasonal calving systems. Three strategies were assessed for defining the nonpregnant cows when partitioning the records according to pregnancy status in the training population. Two of these used records for cows with a subsequent calving only, whereas the third also included records for cows without a subsequent calving. For each partitioning strategy, partial least squares discriminant analysis models were developed, whereby spectra from all the cows in 80% of herds were used to train the models, and predictions on cows in the remaining herds were used for validation. A separate data set was also used as a secondary validation, whereby pregnancy diagnosis had been assigned according to the presence of pregnancy-associated glycoproteins (PAG) in the milk samples. We examined different ways of accounting for stage of lactation in the prediction models, either by including it as an effect in the prediction model, or by pre-adjusting spectra before fitting the model. For a subset of strategies, we also assessed prediction accuracies from deep learning approaches, utilizing either the raw spectra or images of spectra. Across all strategies, prediction accuracies were highest for models using the unadjusted spectra as model predictors.
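The herd-independent design described above (all cows from 80% of herds for training, the remaining herds for validation) can be sketched as a split on herd identifiers instead of on individual records. This is an illustrative sketch only: the record layout, the `herd` key, and the function name are hypothetical and do not come from the study.

```python
import random


def split_by_herd(records, train_fraction=0.8, seed=0):
    """Split records so that every herd falls entirely in one partition.

    `records` is a list of dicts that each carry a "herd" key (hypothetical
    layout). Splitting on herds, not records, keeps cows from the same herd
    out of both the training and validation sets.
    """
    herds = sorted({r["herd"] for r in records})
    rng = random.Random(seed)          # seeded for a reproducible split
    rng.shuffle(herds)
    n_train = round(train_fraction * len(herds))
    train_herds = set(herds[:n_train])
    train = [r for r in records if r["herd"] in train_herds]
    valid = [r for r in records if r["herd"] not in train_herds]
    return train, valid


if __name__ == "__main__":
    # Toy data: 10 herds with 3 spectra records each.
    toy = [{"herd": h, "cow": f"{h}-{c}"} for h in range(10) for c in range(3)]
    train, valid = split_by_herd(toy)
    print(len(train), "training records;", len(valid), "validation records")
```

Splitting at the herd level means validation accuracy reflects performance on herds the model has never seen, which is the situation a deployed predictor would face.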
Strategies for cows with a subsequent calving performed well in herd-independent validation, with sensitivities above 0.79, specificities above 0.91 and area under the receiver operating characteristic curve (AUC) values over 0.91. However, for these strategies, the specificity to predict nonpregnant cows in the external PAG data set was poor (0.002-0.04). The best performing models were those that included records for cows without a subsequent calving, and used unadjusted spectra and days in milk as predictors, with consistent results observed across the training, herd-independent validation and PAG data sets. For the partial least squares discriminant analysis model, sensitivity was 0.71, specificity was 0.54 and the AUC value was 0.68 in the PAG data set; and for an image-based deep learning model, the sensitivity was 0.74, specificity was 0.52 and the AUC value was 0.69. Our results demonstrate that in pasture-based seasonal calving herds, confounding between pregnancy status and spectral changes associated with stage of lactation can inflate prediction accuracies. When the effect of this confounding was reduced, prediction accuracies were not sufficiently high to use as a sole indicator of pregnancy status.
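For readers less familiar with the three metrics reported above, the following minimal sketch shows how sensitivity, specificity and AUC can be computed from true labels and model scores. It uses the standard definitions; the toy labels, scores and the 0.5 threshold are invented for illustration and are not data from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = pregnant)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative.

    Ties count as half a win. This pairwise form is quadratic in the number
    of records, which is fine for a sketch but not for a million spectra.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example (invented values).
    y_true = [1, 1, 1, 0, 0]
    scores = [0.9, 0.8, 0.3, 0.4, 0.1]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    sens, spec = sensitivity_specificity(y_true, y_pred)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc(y_true, scores):.2f}")
```

Under these definitions, a specificity near zero, as reported for the external PAG data set, means that almost every nonpregnant cow was predicted as pregnant.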
Protein complexes are involved in many biological processes and mediate diverse important cellular functions. The tertiary structures of protein complexes provide crucial insight into the molecular mechanisms that regulate their functions and assembly. However, solving protein complex structures by experimental methods is often more difficult than solving single-protein structures.