Xiaowei Xu scite author profile

Representative Sampling for Text Classification Using Support Vector Machines

Abstract. To reduce human effort, there has been increasing interest in applying active learning to train text classifiers. This paper describes a straightforward active learning heuristic, representative sampling, which explores the clustering structure of 'uncertain' documents and identifies representative samples for which to query the user's opinion, with the aim of speeding up the convergence of Support Vector Machine (SVM) classifiers. Compared with other active learning algorithms, representative sampling explicitly addresses the problem of selecting more than one unlabeled document at a time. In an empirical study, we compared representative sampling with both random sampling and SVM active learning. The results showed that representative sampling offers excellent learning performance with fewer labeled documents and can thus reduce human effort in text classification tasks.
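The selection step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not name the clustering method or the uncertainty criterion, so the use of k-means, the margin threshold of 1.0, the pre-trained linear weights `w` and `b`, and the rule of querying the document nearest each centroid are all assumptions made for illustration.

```python
import random

def decision_value(w, b, x):
    """Signed score of a linear classifier: f(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (an assumed clustering choice); returns the centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            clusters[nearest].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            [sum(col) / len(c) for col in zip(*c)] if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
    return centroids

def representative_sample(unlabeled, w, b, k, margin=1.0):
    """Return indices of up to k unlabeled documents to send for labeling."""
    # Step 1: keep only 'uncertain' documents, here those inside the margin.
    uncertain = [i for i, x in enumerate(unlabeled)
                 if abs(decision_value(w, b, x)) < margin]
    if len(uncertain) <= k:
        return uncertain
    # Step 2: cluster the uncertain documents.
    centroids = kmeans([unlabeled[i] for i in uncertain], k)
    # Step 3: query the uncertain document closest to each centroid.
    picks = []
    for c in centroids:
        best = min(uncertain, key=lambda i: squared_distance(unlabeled[i], c))
        if best not in picks:
            picks.append(best)
    return picks

# Toy data: two groups of uncertain documents plus two confident ones.
docs = [[0.1, 0.0], [0.2, 0.1], [-0.1, 5.0], [-0.2, 5.1], [3.0, 0.0], [-3.0, 5.0]]
picks = representative_sample(docs, w=[1.0, 0.0], b=0.0, k=2)
```

On this toy data the two confidently classified documents (indices 4 and 5) are never queried, and one document is drawn from each uncertain group, which is the batch-diversity effect the abstract attributes to selecting more than one document at a time.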


Predicting Hepatotoxicity Using ToxCast in Vitro Bioactivity and Chemical Structure

Liu, Mansouri, Judson et al. 2015. Chem. Res. Toxicol.

137

119


The U.S. Tox21 and EPA ToxCast programs screen thousands of environmental chemicals for bioactivity using hundreds of high-throughput in vitro assays to build predictive models of toxicity. We represented chemicals based on bioactivity and chemical structure descriptors, then used supervised machine learning to predict in vivo hepatotoxic effects. A set of 677 chemicals was represented by 711 in vitro bioactivity descriptors (from ToxCast assays), 4,376 chemical structure descriptors (from QikProp, OpenBabel, PaDEL, and PubChem), and three hepatotoxicity categories (from animal studies). Hepatotoxicants were defined by rat liver histopathology observed after chronic chemical testing and grouped into hypertrophy (161), injury (101) and proliferative lesions (99). Classifiers were built using six machine learning algorithms: linear discriminant analysis (LDA), Naïve Bayes (NB), support vector machines (SVM), classification and regression trees (CART), k-nearest neighbors (KNN), and an ensemble of these classifiers (ENSMB). Classifiers of hepatotoxicity were built using chemical structure descriptors, ToxCast bioactivity descriptors, and hybrid descriptors. Predictive performance was evaluated using 10-fold cross-validation testing and in-loop, filter-based, feature subset selection. Hybrid classifiers had the best balanced accuracy for predicting hypertrophy (0.84 ± 0.08), injury (0.80 ± 0.09), and proliferative lesions (0.80 ± 0.10). Though chemical and bioactivity classifiers had a similar balanced accuracy, the former were more sensitive, and the latter were more specific. CART, ENSMB, and SVM classifiers performed the best, and nuclear receptor activation and mitochondrial functions were frequently found in highly predictive classifiers of hepatotoxicity. ToxCast and ToxRefDB provide the largest and richest publicly available data sets for mining linkages between the in vitro bioactivity of environmental chemicals and their adverse histopathological outcomes. Our findings demonstrate the utility of high-throughput assays for characterizing rodent hepatotoxicants, the benefit of using hybrid representations that integrate bioactivity and chemical structure, and the need for objective evaluation of classification performance.
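Balanced accuracy, the metric this abstract reports, is the mean of sensitivity and specificity, which is why two classifiers can share a similar balanced accuracy while one is more sensitive and the other more specific. The sketch below computes it on made-up toy labels; the function names and the data are illustrative assumptions and do not reproduce the study's pipeline or results.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = toxic)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2

# Toy, imbalanced example (hypothetical labels, not data from the study).
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]

ba = balanced_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Here sensitivity is 1/2 and specificity is 5/6, so the balanced accuracy is about 0.67, while plain accuracy is 0.75. The gap shows why a class-balanced metric is a sensible choice when toxic chemicals are a minority of the set.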


In silico drug repositioning – what we need to know

Liu, Fang, Kelly et al. 2013. Drug Discovery Today

164

102



