Abstract: Discovery and optimization of new catalysts can potentially be
accelerated by efficient data analysis using machine learning (ML).
In this paper, we record the process of searching for additives in
the electrochemical deposition of Cu catalysts for CO2 reduction
(CO2RR) using ML, which includes three iterative cycles:
“experimental test; ML analysis; prediction and redesign”.
Cu catalysts are known for CO2RR to obtain a range of products
including C1 (CO, HCOOH, CH4, CH3OH) and C2+ (C2H4, C2H6, C2H5OH, C3H7OH)…
“…Majority of research dedicated to implement ML in drug discovery/chemistry employs a very narrow range of potential models, even just one [49][50][51] , without a clear rationale for the selection of the algorithms included in the pool assessed 5,18,[52][53][54][55] . Here instead, we purposely screened a large number of potential algorithms based on different approaches (e.g.…”
Despite the high prevalence of diseases affecting cartilage (e.g. knee osteoarthritis, which affects 16% of the population globally), no curative treatments are available because drugs have a limited capacity to localise in such tissue, owing to its low vascularisation and electrostatic repulsion. While an effective delivery system is sought, the only option is to use high drug doses, which can lead to systemic side effects. We introduced poly-beta-amino-esters (PBAEs) to effectively deliver drugs into cartilage tissues. PBAEs are copolymers of amines and di-acrylates, further end-capped with another amine, and therefore encompass a very large research space for the identification of optimal candidates. To accelerate the screening of all possible PBAEs, the results from a small pool of polymers (n = 90) were used to train a variety of machine learning (ML) methods using only polymer properties available in public libraries or estimated from the chemical structure. Bagged multivariate adaptive regression splines (MARS) returned the best predictive performance and was used on the remaining (n = 3915) possible PBAEs, resulting in the recognition of pivotal features; a further round of screening was carried out on PBAEs (n = 150) with small structural variations of the main candidates from the first round. Refining these characteristics enabled the identification of a leading candidate predicted to improve drug uptake more than 20-fold over conventional clinical treatment; this uptake improvement was also confirmed experimentally. This work highlights the potential of ML to accelerate biomaterials development by efficiently extracting information from a limited experimental dataset, thus allowing patients to benefit earlier from a new technology and at a lower price. Such a roadmap could also be applied to other drug/materials development where optimisation would normally be approached through combinatorial chemistry.
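The train-then-screen workflow in the abstract above (fit a bagged model on a small tested pool, then rank the untested candidates by predicted response) might look roughly like the sketch below. It is only an illustration under stated assumptions: the base learner is a one-variable least-squares line standing in for MARS, which is not implemented here, and every function name and the single numeric descriptor are hypothetical, not taken from the cited work.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x. Stand-in base learner, NOT MARS."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # Degenerate resample (all x identical): fall back to the mean response.
        return mean_y, 0.0
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope


def bagged_fit(xs, ys, n_models=25, seed=0):
    """Fit one base learner per bootstrap resample of the training pool."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Sample n training indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagged_predict(models, x):
    """Average the predictions of all bootstrap models."""
    return sum(a + b * x for a, b in models) / len(models)


def rank_candidates(models, candidates):
    """Sort untested candidates by predicted response, best first."""
    return sorted(candidates, key=lambda x: bagged_predict(models, x), reverse=True)
```

In this sketch, `bagged_fit` would be called on the descriptor values and measured responses of the tested pool, and `rank_candidates` on the descriptor values of the untested candidates. A real screen would use many descriptors per polymer and a proper MARS implementation.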
“…Recently, Wang et al have provided a new path for additive selection and optimization by combining experiment and theory with efficient data analysis through machine guidance (Figure 6c) [43] . They indicate that Sn salt can be used as an important additive for the production of CO and There are many influencing factors of electrodeposition, due to its huge technical parameter space.…”
In recent years, electrocatalytic reduction of CO2 has been a focus in the research field. There are also various methods for synthesizing catalysts, such as hydrothermal method, arc method, electrospinning,...
“…564 of the features come from molecular fragment fingerprint (MFF) featurization. 34 In MFF, molecular fragments were generated by the extended-connectivity fingerprints (ECFP) method using a radius of 436 supported by the Deepchem python toolkit. 35 A vector recording the appearance times of each fragment in a molecule 36 was then created (Figure 2).…”
Molecules with strong two-photon absorption (TPA) are important in many advanced applications such as upconverted lasers and photodynamic therapy, but their design is hampered by the high cost of experimental screening and accurate quantum chemical (QC) calculations. Here we perform a systematic study by collecting an experimental TPA database of ca. 900 molecules and analyzing it with interpretable machine learning (ML). We found that only a few molecular features are sufficient to explain the TPA magnitudes. The most important feature is conjugation length (rather than area, as believed before), followed by features reflecting the effects of donor and acceptor substitution and coplanarity. These features are used to create a very fast ML model with prediction errors of similar magnitude to those of experiment and affordable QC methods. Our ML model has the potential for high-throughput screening, as additionally validated with our new experimental measurements.
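The fragment-count vector mentioned in the quoted passage above (one entry per fragment, holding the number of times it appears in a molecule) can be illustrated with a small sketch. It assumes the fragments have already been extracted for each molecule, for example by an ECFP-style routine. That extraction step is not shown, and the fragment labels and function names are hypothetical placeholders rather than the cited work's code.

```python
from collections import Counter


def build_vocabulary(molecule_fragments):
    """Collect every distinct fragment seen across all molecules, in sorted order."""
    vocabulary = set()
    for fragments in molecule_fragments:
        vocabulary.update(fragments)
    return sorted(vocabulary)


def count_vector(fragments, vocabulary):
    """Count how often each vocabulary fragment appears in one molecule."""
    counts = Counter(fragments)
    return [counts.get(fragment, 0) for fragment in vocabulary]


def featurize(molecule_fragments):
    """Return the shared vocabulary and one count vector per molecule."""
    vocabulary = build_vocabulary(molecule_fragments)
    return vocabulary, [count_vector(f, vocabulary) for f in molecule_fragments]
```

Each molecule is passed as a list of fragment labels, with repeats kept. The resulting vectors share one column order, so they can be stacked into a feature matrix for a downstream model.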