Numerous regression-based and machine learning techniques are available for the development of linear and nonlinear QSAR models that can accurately predict biological endpoints. Such tools can be quite powerful in the hands of an experienced modeler, but too frequently a disconnect remains between the modeler and project chemist because the resulting QSAR models are effectively black boxes. As a result, learning methods that yield models that can be visualized in the context of chemical structures are in high demand. In this work, we combine direct kernel-based PLS with Canvas 2D fingerprints to arrive at predictive QSAR models that can be projected onto the atoms of a chemical structure, allowing immediate identification of favorable and unfavorable characteristics. The method is validated using binding affinities for ligands from 10 different protein targets covering 7 distinct protein families. Models with significant predictive ability (test set Q² > 0.5) are obtained for 6 of 10 data sets, and fingerprints are shown to consistently outperform large collections of classical physicochemical and topological descriptors. In addition, we demonstrate how a simple bootstrapping technique may be employed to obtain uncertainties that provide meaningful estimates of prediction accuracy.
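The bootstrapping idea mentioned at the end of this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a one-descriptor least-squares line stands in for the kernel-based PLS model, the toy data are invented, and the names `fit_line` and `bootstrap_predict` are hypothetical. The sketch only shows the general pattern of refitting on resampled training sets and using the spread of the resulting predictions as an uncertainty estimate.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (assumed stand-in for the real QSAR learner)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        # Degenerate resample (all x identical): fall back to a constant model.
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def bootstrap_predict(xs, ys, x_new, n_boot=200, seed=0):
    """Return (mean, standard deviation) of predictions over bootstrap refits."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_boot):
        # Resample the training set with replacement and refit the model.
        idx = [rng.randrange(n) for _ in range(n)]
        a, b = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        preds.append(a + b * x_new)
    return statistics.fmean(preds), statistics.stdev(preds)


# Invented toy data: one descriptor value and one activity value per compound.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0]

mean_in, sd_in = bootstrap_predict(xs, ys, 2.5)     # inside the training range
mean_out, sd_out = bootstrap_predict(xs, ys, 20.0)  # far outside the training range
print(f"interpolation: {mean_in:.2f} +/- {sd_in:.2f}")
print(f"extrapolation: {mean_out:.2f} +/- {sd_out:.2f}")
```

In this toy setting the spread is larger for the query far outside the training range, which is the behavior one would want from an uncertainty that tracks prediction accuracy.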
In recent years, generative machine learning has attracted significant attention as an enabling approach for designing novel molecular materials with minimal design bias, thereby realizing more directed design for a specific materials property space. Further, data-driven approaches have emerged as a new tool to accelerate the development of novel organic electronic materials for organic light-emitting diode (OLED) applications. We demonstrate and validate a goal-directed generative machine learning framework based on a recurrent neural network (RNN) deep reinforcement learning approach for the design of hole-transporting OLED materials. These large-scale molecular simulations also demonstrate a rapid, cost-effective method to identify new materials for OLEDs while enabling expansion into many other verticals such as catalyst design, aerospace, life science, and petrochemicals.
Data-driven methods are receiving increasing attention as a means to accelerate materials design and discovery for organic light-emitting diodes (OLEDs). Machine learning (ML) has enabled high-throughput screening of materials properties to suggest new candidates for organic electronics. However, building reliable predictive ML models requires creating and managing a high volume of data that adequately addresses the complexity of materials’ chemical space. In this regard, active learning (AL) has emerged as a powerful strategy to efficiently navigate the search space by prioritizing which unexplored data points to evaluate next. This approach provides a more systematic mechanism for identifying promising candidates by minimizing the number of computations required to explore an extensive materials library with diverse variables and parameters. In this paper, we applied an AL workflow that accounts for multiple optoelectronic parameters to identify materials candidates for hole-transport layers (HTLs) in OLEDs. The results of this work pave the way for efficient screening of materials for organic electronics with superior efficiencies before laborious simulations, synthesis, and device fabrication.
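The general shape of a pool-based active learning loop, as described in this abstract, can be sketched as follows. This is an illustration under stated assumptions, not the workflow used in the paper: the candidate library is a list of numbers, `expensive_property` is a hypothetical stand-in for a costly simulation, the surrogate is a simple nearest-neighbor lookup, and the acquisition score (predicted value plus a distance bonus weighted by `explore`) is one of many possible choices. A single property is scored here, whereas the paper considers multiple optoelectronic parameters.

```python
import random


def expensive_property(x):
    """Hypothetical stand-in for a costly simulation of one candidate."""
    return -(x - 0.7) ** 2


def nearest_labeled(x, labeled):
    """Return (distance, value) for the closest already-evaluated candidate."""
    return min((abs(x - xl), yl) for xl, yl in labeled.items())


def active_learning(pool, n_init=3, n_rounds=10, explore=1.0, seed=0):
    """Evaluate n_init random candidates, then pick one new candidate per round."""
    rng = random.Random(seed)
    labeled = {x: expensive_property(x) for x in rng.sample(pool, n_init)}

    def score(x):
        # Acquisition: surrogate prediction plus a bonus for unexplored regions.
        dist, pred = nearest_labeled(x, labeled)
        return pred + explore * dist

    for _ in range(n_rounds):
        unlabeled = [x for x in pool if x not in labeled]
        if not unlabeled:
            break
        best = max(unlabeled, key=score)
        labeled[best] = expensive_property(best)  # the only expensive call per round
    return labeled


pool = [i / 100 for i in range(101)]  # toy "materials library" of 101 candidates
labeled = active_learning(pool)
best_x = max(labeled, key=labeled.get)
print(f"evaluated {len(labeled)} of {len(pool)} candidates; best so far: x = {best_x}")
```

The point of the pattern is that only a small fraction of the library is ever passed to the expensive evaluation; the surrogate and acquisition score decide which candidates are worth that cost.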
A new, accelerated design scheme for photoinitiators based on an advanced machine learning framework is studied. The design space for photoinitiators is set by over 120 unique oxime ester compounds that were synthesized and measured for their photosensitivity. Then, an automated machine learning algorithm is used to rapidly identify the best quantitative structure–property relationship (QSPR) models among hundreds that are generated, ranked, and validated in an automated fashion to predict photosensitivity. Top-performing models are highly predictive, with coefficients of determination of around 0.8 for compounds that are unknown to the models. Visual interpretation of the predictive models based on atom-site contributions offers a clear and intuitive direction for designing new photoinitiators. Based on the machine learning-assisted analysis, three new oxime ester compounds were selected for synthesis and further evaluation as novel photoinitiators. Experimental validation confirms high photosensitivity in all of the newly synthesized candidates. The work demonstrates the value of combining synthesis with the automated machine learning framework as a fast and reliable approach that provides unbiased insights often hidden in high-dimensional data space.