Experimental chemical shifts (CS) from solution and solid-state magic-angle-spinning nuclear magnetic resonance (NMR) spectra provide atomic-level information for each amino acid within a protein or protein complex. However, structure determination of large complexes and assemblies based on NMR data alone remains challenging due to the complexity of the calculations. Here, we present a hardware-accelerated strategy for estimating the NMR chemical shifts of large macromolecular complexes, based on the previously published PPM_One software. The original code was not viable for computing large complexes, with our largest dataset taking approximately 14 hours to complete. Our results show that serial code refactoring and parallel acceleration reduced the runtime on an NVIDIA Volta 100 (V100) Graphics Processing Unit (GPU) to 46.71 seconds for our largest dataset of 11.3 million atoms. We use OpenACC, a directive-based programming model, to port the application to a heterogeneous system consisting of x86 processors and NVIDIA GPUs. Finally, we demonstrate the feasibility of our approach on systems of increasing complexity, ranging from 100K to 11.3M atoms.
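The abstract names OpenACC as the porting mechanism but the page carries no code, so the following is a minimal sketch of that directive-based pattern. It assumes the computation is dominated by a pairwise O(N·M) loop in which each query nucleus accumulates a distance-dependent contribution from surrounding atoms; all array names, sizes, and the 1/r^3 falloff are illustrative assumptions, not PPM_One's actual kernel.

// Minimal OpenACC sketch (illustrative only; not PPM_One's actual kernel).
// Assumption: chemical-shift estimation is dominated by a pairwise loop in
// which every query nucleus accumulates a distance-dependent contribution
// from every surrounding atom.
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    const int n_query = 10000;   // nuclei whose shifts are estimated (assumed size)
    const int n_atoms = 100000;  // atoms contributing to each shift (assumed size)

    std::vector<float> qx(n_query, 0.f), qy(n_query, 0.f), qz(n_query, 0.f);
    std::vector<float> ax(n_atoms, 0.f), ay(n_atoms, 0.f), az(n_atoms, 0.f);
    std::vector<float> q(n_atoms, 0.5f);   // per-atom weight (e.g. a partial charge)
    std::vector<float> shift(n_query, 0.f);

    // Placeholder geometry; a real run would parse coordinates from a PDB file.
    for (int i = 0; i < n_query; ++i) qx[i] = 0.1f * i;
    for (int j = 0; j < n_atoms; ++j) { ax[j] = 0.01f * j; ay[j] = 1.f; az[j] = 1.f; }

    float *qxp = qx.data(), *qyp = qy.data(), *qzp = qz.data();
    float *axp = ax.data(), *ayp = ay.data(), *azp = az.data();
    float *qp = q.data(), *sp = shift.data();

    // One directive offloads the outer loop to the GPU; without OpenACC
    // enabled, the same source still builds and runs serially on the CPU.
    #pragma acc parallel loop \
        copyin(qxp[0:n_query], qyp[0:n_query], qzp[0:n_query], \
               axp[0:n_atoms], ayp[0:n_atoms], azp[0:n_atoms], qp[0:n_atoms]) \
        copyout(sp[0:n_query])
    for (int i = 0; i < n_query; ++i) {
        float acc = 0.f;
        #pragma acc loop reduction(+:acc)
        for (int j = 0; j < n_atoms; ++j) {
            float dx = qxp[i] - axp[j];
            float dy = qyp[i] - ayp[j];
            float dz = qzp[i] - azp[j];
            float r2 = dx * dx + dy * dy + dz * dz + 1e-6f;
            acc += qp[j] / (r2 * std::sqrt(r2));  // illustrative 1/r^3 term
        }
        sp[i] = acc;
    }

    printf("shift[0] = %g\n", sp[0]);
    return 0;
}

A file like this builds with, e.g., nvc++ -acc -Minfo=accel, and compiles unchanged with an ordinary C++ compiler that ignores the directives, which is the portability property the abstract is claiming.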
Motivation: The application of machine learning (ML) techniques in the medical field has demonstrated both successes and challenges in the precision medicine era. The ability to accurately classify a subject as a potential responder versus a non-responder to a given therapy is still an active area of research, pushing the field to create new approaches for applying ML techniques. In this study we leveraged publicly available data from the BeatAML initiative. Specifically, we used gene count data, generated via RNA-seq, from 451 individuals, matched with ex vivo response data generated from treatment with RTK-type-III inhibitors. Three feature selection techniques were tested: Principal Component Analysis (PCA), Shapley Additive Explanations (SHAP), and differential gene expression (DGE) analysis, each paired with three different classifiers: XGBoost, LightGBM, and Random Forest. Sensitivity versus specificity was analyzed using the area under the receiver operating characteristic curve (AUC-ROC) for every model developed.
Results: Our work demonstrated that the feature selection technique, rather than the classifier, had the greatest impact on model performance. SHAP outperformed the other feature selection techniques and predicted treatment response with high accuracy; the best-performing model, for Foretinib, achieved 89% AUC using SHAP with a Random Forest classifier. Our ML pipelines demonstrate that a transcriptomic signature present at the time of diagnosis can potentially predict response to treatment, demonstrating the potential of ML applications in precision medicine efforts.
Availability and implementation: https://github.com/UD-CRPL/RCDML
Supplementary information: Supplementary data are available at Bioinformatics online.
In addition, we show that the background current from common matrices, such as urine, blood, and faeces, contributes little to the background plasmonic current in plasmonic immunoassays, which is in stark contrast to traditional fluorescence assays that rely on optical detection.
Introduction

Computing architectures are ever-evolving. As these architectures become increasingly complex, we need better software stacks that will help us seamlessly port real-world scientific applications to these emerging architectures. It is also important to prepare applications such that they can be readily retargeted to existing and future systems without the need for drastic changes to the code itself. In an ideal world, we are looking for solutions to create performance-productive software. However, this is not easy and is sometimes an impossible task to accomplish.

Programming and optimizing for different architectures at a minimum often requires codes written in different programming languages. This presents an inherent difficulty for software developers, as they would need to develop and maintain an entire secondary code base. For this reason, it is ideal to have a single programming standard that is both portable to all architectures and maintains high performance. There are three main reasons why this is difficult: (1) sufficient parallelism is not exposed to the hardware architecture if the algorithm is structured in a way that limits the level of concurrency, (2) features in a programming model are often hardware-facing and only occasionally application/user-facing, and (3) encompassing different applications from different fields of study would require the programming standard to have many levels of abstraction in a sensible way.

There are currently three widely accepted solutions that software developers adopt to create performance-portable applications: libraries, languages, and directives. Libraries suffer from an inherent scope problem; they can only solve a specific subset of problems and are only designed for a specific subset of architectures. Languages are flawed because of the reas...