Kiyoto Aramis Tanemura scite author profile

The interpretation of ion mobility coupled to mass spectrometry (IM-MS) data to predict unknown structures is challenging and depends on accurate theoretical estimates of the molecular ion collision cross section (CCS) against a buffer gas in a low- or atmospheric-pressure drift chamber. The sensitivity and reliability of computational prediction of CCS values depend on accurately modeling the molecular state over accessible conformations. In this work, we developed an efficient CCS computational workflow using a machine learning model in conjunction with standard DFT methods and CCS calculations. Furthermore, we performed Traveling Wave IM-MS (TWIMS) experiments to validate the extant experimental values and assess uncertainties in experimentally measured CCS values. The developed workflow yields accurate structural predictions and provides unique insights into the likely preferred conformation analyzed in IM-MS experiments. The complete workflow makes the computation of CCS values tractable for a large number of conformationally flexible metabolites with complex molecular structures.

AutoGraph: Autonomous Graph-Based Clustering of Small-Molecule Conformations

Tanemura

Das

Merz

2021

J. Chem. Inf. Model.

While accurately modeling the conformational ensemble is required for predicting properties of flexible molecules, the optimal method of obtaining the conformational ensemble seems as varied as their applications. Ensemble structures have been modeled by generation, refinement, and clustering of conformations with a sufficient number of samples. We present a conformational clustering algorithm intended to automate the conformational clustering step through the Louvain algorithm, which requires minimal hyperparameters and, importantly, no predefined number of clusters or threshold values. The conformational graphs produced by this method for O-succinyl-L-homoserine, oxidized nicotinamide adenine dinucleotide, and 200 representative metabolites each preserved the geometric/energetic correlation expected for points on the potential energy surface. Clustering based on these graphs provides partitions informed by the potential energy surface. Automating conformational clustering in a workflow with AutoGraph may mitigate human biases introduced by guess-and-check hyperparameter selection while allowing flexibility in the result by not imposing predefined criteria other than optimizing the model's loss function.
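As a rough illustration of modularity-based clustering on a conformational graph, the sketch below runs only the local-moving phase of a Louvain-style optimization on a toy weighted graph. The six nodes stand for hypothetical conformers and the edge weights for hypothetical similarities; the graph, the weights, and the function names (`modularity`, `local_moving`) are assumptions for illustration and are not taken from AutoGraph. The full Louvain algorithm also aggregates communities into super-nodes and repeats, which is omitted here.

```python
def modularity(weights, labels):
    """Newman modularity of a partition of a symmetric weighted graph."""
    n = len(weights)
    degree = [sum(row) for row in weights]
    two_m = sum(degree)  # every edge is counted in both directions
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += weights[i][j] - degree[i] * degree[j] / two_m
    return q / two_m


def local_moving(weights):
    """Greedy local-moving phase: each node joins the neighbouring
    community that raises modularity the most, until nothing moves."""
    n = len(weights)
    labels = list(range(n))  # start with one community per node
    improved = True
    while improved:
        improved = False
        for i in range(n):
            best_label = labels[i]
            best_q = modularity(weights, labels)
            for j in range(n):
                if weights[i][j] > 0 and labels[j] != labels[i]:
                    trial = labels[:]
                    trial[i] = labels[j]
                    q = modularity(weights, trial)
                    if q > best_q + 1e-12:
                        best_label, best_q = labels[j], q
            if best_label != labels[i]:
                labels[i] = best_label
                improved = True
    return labels


def toy_graph():
    """Two tight triangles of conformers joined by one weak edge
    (hypothetical similarity weights)."""
    n = 6
    w = [[0.0] * n for _ in range(n)]
    edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
             (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
             (2, 3, 0.1)]
    for i, j, wt in edges:
        w[i][j] = w[j][i] = wt
    return w


weights = toy_graph()
labels = local_moving(weights)
print(labels, round(modularity(weights, labels), 3))
```

No cluster count or distance cutoff is passed in; the partition follows from maximizing modularity alone, which is the property the abstract emphasizes.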

Python for Chemists

Merz

Tanemura

Sierra-Costa

2021

Preprocessing of Single Cell RNA Sequencing Data Using Correlated Clustering and Projection

Hozumi

Tanemura

Wei

2023

J. Chem. Inf. Model.

Single-cell RNA sequencing (scRNA-seq) is widely used to reveal heterogeneity in cells, which has given us insights into cell−cell communication, cell differentiation, and differential gene expression. However, analyzing scRNA-seq data is a challenge due to sparsity and the large number of genes involved. Therefore, dimensionality reduction and feature selection are important for removing spurious signals and enhancing the downstream analysis. We present Correlated Clustering and Projection (CCP), a new data-domain dimensionality reduction method, for the first time. CCP projects each cluster of similar genes into a supergene defined as the accumulated pairwise nonlinear gene−gene correlations among all cells. Using 14 benchmark data sets, we demonstrate that CCP has significant advantages over classical principal component analysis (PCA) for clustering and/or classification problems with intrinsically high dimensionality. In addition, we introduce the Residue-Similarity index (RSI) as a novel metric for clustering and classification and the R-S plot as a new visualization tool. We show that the RSI correlates with accuracy without requiring the knowledge of the true labels. The R-S plot provides a unique alternative to the uniform manifold approximation and projection (UMAP) and t-distributed stochastic neighbor embedding (t-SNE) for data with a large number of cell types.
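One plausible reading of the supergene projection can be sketched as follows. The Gaussian kernel, the scale parameter `eta`, the toy expression matrix, and the function name `supergenes` are all assumptions made for illustration, and the gene partition is supplied by hand rather than computed; the published CCP method may differ in each of these choices.

```python
import math


def supergenes(expression, gene_clusters, eta=1.0):
    """Project each gene cluster to one value per cell.

    expression: list of cells, each a list of gene values.
    gene_clusters: list of lists of gene (column) indices.
    For cell i and cluster c, the value is the sum over all other
    cells j of a Gaussian kernel of the distance between cells i
    and j measured only on the genes in cluster c (assumed kernel).
    """
    n_cells = len(expression)
    projected = [[0.0] * len(gene_clusters) for _ in range(n_cells)]
    for c, genes in enumerate(gene_clusters):
        for i in range(n_cells):
            total = 0.0
            for j in range(n_cells):
                if i == j:
                    continue
                dist_sq = sum(
                    (expression[i][g] - expression[j][g]) ** 2 for g in genes
                )
                total += math.exp(-dist_sq / eta ** 2)
            projected[i][c] = total
    return projected


# Toy data: 3 cells x 4 genes, with genes split into two hand-picked clusters.
expression = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 5.0, 5.0],
    [9.0, 9.0, 5.0, 5.0],
]
gene_clusters = [[0, 1], [2, 3]]

projected = supergenes(expression, gene_clusters)
for row in projected:
    print([round(v, 3) for v in row])
```

The output has one column per gene cluster rather than one per gene, so the dimensionality drops from the number of genes to the number of clusters while each value still reflects how a cell relates to all other cells.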

Refinement of pairwise potentials via logistic regression to score protein‐protein interactions

2020

Protein-protein interactions (PPIs) are ubiquitous and functionally important in biological systems. Hence, accurate prediction of PPIs by protein-protein docking and scoring tools is highly desirable for characterizing their structure and biological function. Ab initio docking protocols consist of two steps: sampling docking poses to produce at least one near-native structure, and then scoring the many candidate structures. Concurrent development in both sampling and scoring is crucial for the deployment of protein-protein docking software. In the present work, we apply a machine learning model to pairwise potentials to improve the detection of native protein quaternary structures among decoys. A decoy set was featurized using the Knowledge and Empirical Combined Scoring Algorithm 2 (KECSA2) pairwise potential. The highly unbalanced decoy set was then balanced using a comparison concept between native and decoy structures. The resultant comparison descriptors were used to train a logistic regression (LR) classifier. The LR model yielded the best performance for native detection among decoys compared with conventional scoring functions, while performing less well at detecting low root-mean-square-deviation decoy structures. Its deployment on an independent benchmark set confirms that the scoring function performs competitively relative to other scoring functions. The scripts used are available at https://github.com/TanemuraKiyoto/PPI-native-detection-via-LR.
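The balancing idea can be illustrated with a minimal sketch, assuming that a comparison descriptor is the feature difference between two structures: each native/decoy pair then yields one positive example (native minus decoy) and one negative example (decoy minus native), so the classes are balanced by construction. The difference-based descriptor, the two-dimensional toy features, and the function names are assumptions for illustration; they are not taken from the paper or its repository, and the real features would come from the KECSA2 potential.

```python
import math


def comparison_descriptors(native, decoys):
    """Build a balanced training set from one native and many decoys."""
    X, y = [], []
    for decoy in decoys:
        X.append([a - b for a, b in zip(native, decoy)])  # native first
        y.append(1)
        X.append([b - a for a, b in zip(native, decoy)])  # decoy first
        y.append(0)
    return X, y


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.1, epochs=500):
    """Plain batch gradient descent on the logistic loss, no intercept
    (the antisymmetric pairs make the data symmetric about the origin)."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wk * xk for wk, xk in zip(w, xi))) - yi
            for k, xk in enumerate(xi):
                grad[k] += err * xk
        w = [wk - lr * gk / len(X) for wk, gk in zip(w, grad)]
    return w


def prefers(w, a, b):
    """Estimated probability that structure a is more native-like than b."""
    diff = [p - q for p, q in zip(a, b)]
    return sigmoid(sum(wk * dk for wk, dk in zip(w, diff)))


# Hypothetical two-term potentials (lower is more favourable).
native = [-5.0, -3.0]
decoys = [[-2.0, -1.0], [-1.0, -2.5], [-3.0, 0.0]]

X, y = comparison_descriptors(native, decoys)
w_fit = train_logistic(X, y)
print([round(prefers(w_fit, native, d), 3) for d in decoys])
```

Because the classifier scores ordered pairs, ranking a pool of candidates would require comparing structures against each other rather than scoring each one in isolation.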
