Polycyclic aromatic hydrocarbons (PAHs) are a complex group of environmental contaminants, many having long environmental half-lives. As these compounds degrade, the changes in their structure can result in a substantial increase in mutagenicity compared to the parent compound. Over time, each individual PAH can potentially degrade into several thousand unique transformation products, creating a complex, constantly evolving set of intermediates. Microbial degradation is the primary mechanism of their transformation and ultimate removal from the environment, and this process can result in mutagenic activation similar to the metabolic activation that can occur in multicellular organisms. The diversity of the potential intermediate structures in PAH-contaminated environments renders hazard assessment difficult for both remediation professionals and regulators. A mixture of structural and energetic descriptors has proven effective in existing studies for classifying which PAH transformation products will be mutagenic. However, most existing studies of environmental PAH mutagens focus on nitrogenated derivatives, which are prevalent in the atmosphere but less relevant in soil. Additionally, PAH products commonly found in the environment can range from as large as five rings to as small as a single ring, requiring a broadly inclusive methodology to comprehensively evaluate mutagenic potential. We developed a combination of supervised and unsupervised machine learning methods to predict environmentally induced PAH mutagenicity with improved performance over currently available tools. K-means clustering with principal component analysis allows us to identify molecular clusters that we hypothesize to have similar mechanisms of action. Recursive feature elimination identifies the most influential descriptors. The cluster-specific regression outperforms available classifiers in predicting direct-acting mutagens resulting from the microbial biodegradation of PAHs and provides direction for future studies evaluating the environmental hazards resulting from PAH biodegradation.
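The workflow outlined above (principal component analysis, k-means clustering, recursive feature elimination, and a separate model per cluster) might be sketched roughly as follows with scikit-learn. This is an illustrative sketch only, not the authors' implementation. The descriptor matrix is synthetic, and the number of components, the number of clusters, the number of retained descriptors, and the use of logistic regression for the cluster-specific model are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a descriptor matrix (rows: compounds, columns:
# structural/energetic descriptors) with binary mutagenicity labels.
# Real descriptors and labels would replace this (assumption).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=5, n_redundant=2, random_state=0
)

# Standardize descriptors so no single descriptor dominates PCA or k-means.
X_scaled = StandardScaler().fit_transform(X)

# Unsupervised step: project onto a few principal components, then cluster.
# 3 components and 3 clusters are arbitrary choices for illustration.
X_pca = PCA(n_components=3, random_state=0).fit_transform(X_scaled)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_pca)

# Supervised step: within each cluster, recursive feature elimination keeps
# the 4 most influential descriptors and fits a logistic regression on them
# (logistic regression is an assumed stand-in for the cluster-specific model).
models = {}
for cluster_id in np.unique(labels):
    mask = labels == cluster_id
    if len(np.unique(y[mask])) < 2:
        continue  # a cluster containing a single class cannot be fit
    selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=4)
    selector.fit(X_scaled[mask], y[mask])
    models[cluster_id] = selector
    kept = np.flatnonzero(selector.support_)
    print(f"cluster {cluster_id}: n={mask.sum()}, kept descriptor indices={kept.tolist()}")
```

Fitting one selector per cluster lets each cluster retain a different set of descriptors, which fits the hypothesis that clusters correspond to different mechanisms of action. A realistic evaluation would score each model on held-out compounds rather than on the training data.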