Svetlana I. Ovchinnikova scite author profile

Over the years, a number of dimensionality reduction techniques have been proposed and used in chemoinformatics to perform nonlinear mappings. In this study, four nonlinear dimensionality reduction methods representing two different families were analyzed: distance-based approaches (Isomap and Diffusion Maps) and topology-based approaches (Generative Topographic Mapping (GTM) and Laplacian Eigenmaps). The methods were applied to the visualization of three toxicity datasets using four sets of descriptors. Two methods, GTM and Diffusion Maps, were identified as the best approaches; because they belong to different families, no single family could be prioritized. The intrinsic dimensionality of the data was assessed using Maximum Likelihood Estimation. Descriptor sets with a higher intrinsic dimensionality were observed to yield maps of lower quality. A new statistical coefficient, which combines two previously known ones, was proposed to rank the maps automatically. Instead of relying on a single best method, we propose to automatically generate maps with different parameter values for different descriptor sets. With this procedure, the maps with the highest values of the introduced coefficient can be selected automatically and used as a starting point for visual inspection by the user.
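The abstract does not say which maximum likelihood estimator of intrinsic dimensionality was used. As a rough illustration only, the sketch below assumes the common nearest-neighbour formulation attributed to Levina and Bickel, in which each point's local estimate is the inverse of the mean log-ratio between its k-th neighbour distance and the closer neighbour distances. The function name, the choice of k, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
import random


def mle_intrinsic_dimension(points, k=10):
    """Nearest-neighbour MLE of intrinsic dimensionality (assumed Levina-Bickel form).

    points: list of equal-length coordinate tuples; k: number of neighbours used.
    """
    if not 2 <= k < len(points):
        raise ValueError("k must satisfy 2 <= k < number of points")
    local_estimates = []
    for i, p in enumerate(points):
        # Distances from p to every other point, keeping the k nearest.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)[:k]
        t_k = dists[-1]
        # Sum of log(T_k / T_j) over the closer neighbours; zero distances
        # (duplicate points) are ignored to avoid division by zero.
        log_sum = sum(math.log(t_k / t_j) for t_j in dists[:-1] if t_j > 0)
        if log_sum > 0:
            local_estimates.append((k - 1) / log_sum)
    if not local_estimates:
        raise ValueError("no usable local estimates (too many duplicate points?)")
    # The global estimate is the average of the per-point estimates.
    return sum(local_estimates) / len(local_estimates)


random.seed(0)
# Hypothetical toy data: a 2-D plane embedded linearly in 5-D space.
plane = []
for _ in range(300):
    u, v = random.random(), random.random()
    plane.append((u, v, u + v, 0.5 * u, 0.0))

estimate = mle_intrinsic_dimension(plane, k=10)
print(f"estimated intrinsic dimension: {estimate:.2f}")
```

For data like this toy plane the estimate should land near 2 even though each point has five coordinates, which is the sense in which a descriptor set can have an intrinsic dimensionality well below its nominal descriptor count.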


Impact of distance-based metric learning on classification and visualization model performance and structure–activity landscapes

Kireeva

Ovchinnikova

Kuznetsov

et al. 2014

J Comput Aided Mol Des


This study concerns the large margin nearest neighbor classifier and its multi-metric extension as efficient approaches to metric learning, which aims to learn an appropriate distance/similarity function for the case studies considered. In recent years, many studies in data mining and pattern recognition have demonstrated that a learned metric can significantly improve performance in classification, clustering and retrieval tasks. The paper describes the application of the metric learning approach to in silico assessment of chemical liabilities. Chemical liabilities, such as adverse effects and toxicity, play a significant role in the drug discovery process, and their in silico assessment is an important step aimed at reducing costs and animal testing by complementing or replacing in vitro and in vivo experiments. Here, to our knowledge for the first time, distance-based metric learning procedures have been applied to in silico assessment of chemical liabilities. The impact of metric learning on structure-activity landscapes and on the predictive performance of the developed models has been analyzed, and the learned metric was used in support vector machines. The metric learning results are illustrated using linear and non-linear data visualization techniques to show how the change of metric affected nearest-neighbor relations and descriptor space.



Svetlana I. Ovchinnikova

The complexation of metal ions with various organic ligands in water: prediction of stability constants by QSPR ensemble modelling

Nonlinear Dimensionality Reduction for Visualizing Toxicity Data: Distance‐Based Versus Topology‐Based Approaches

Impact of distance-based metric learning on classification and visualization model performance and structure–activity landscapes
