Applications of machine learning (ML) to synthetic chemistry rely on the assumption that large numbers of literature-reported examples should enable construction of accurate and predictive models of chemical reactivity. This paper demonstrates that abundance of carefully curated literature data may be insufficient for this purpose. Using an example of Suzuki–Miyaura coupling with heterocyclic building blocks—and a carefully selected database of >10,000 literature examples—we show that ML models cannot offer any meaningful predictions of optimum reaction conditions, even if the search space is restricted to only solvents and bases. This result holds irrespective of the ML model applied (from simple feed-forward to state-of-the-art graph-convolution neural networks) or the representation to describe the reaction partners (various fingerprints, chemical descriptors, latent representations, etc.). In all cases, the ML methods fail to perform significantly better than naive assignments based on the sheer frequency of certain reaction conditions reported in the literature. These unsatisfactory results likely reflect subjective preferences of various chemists to use certain protocols, other biasing factors as mundane as availability of certain solvents/reagents, and/or a lack of negative data. These findings highlight the likely importance of systematically generating reliable and standardized data sets for algorithm training.
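The frequency baseline mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each reaction is labeled with a single condition (here a solvent) and that predictions are scored by top-k accuracy. The function names and toy data are hypothetical and do not come from the paper.

```python
from collections import Counter


def popularity_ranking(train_labels):
    """Rank condition labels by how often they occur in the training set."""
    counts = Counter(train_labels)
    return [label for label, _ in counts.most_common()]


def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of reactions whose reported condition is among the top-k guesses."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Hypothetical toy data: solvents reported for training and test reactions.
train_solvents = ["dioxane", "dioxane", "toluene", "DMF", "dioxane", "toluene"]
test_solvents = ["dioxane", "toluene", "DMF", "dioxane"]

# The naive baseline ignores the substrates entirely and proposes the same
# popularity-ordered list for every test reaction.
ranking = popularity_ranking(train_solvents)
baseline_predictions = [ranking] * len(test_solvents)

baseline_top1 = top_k_accuracy(baseline_predictions, test_solvents, k=1)
baseline_top2 = top_k_accuracy(baseline_predictions, test_solvents, k=2)
```

A trained model would be scored with the same `top_k_accuracy` function. Under this framing, the abstract's claim is that the model's scores are not significantly higher than the baseline's.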
General conditions for organic reactions are important but rare, and efforts to identify them usually consider only narrow regions of chemical space. Discovering more general reaction conditions requires considering vast regions of chemical space derived from a large matrix of substrates crossed with a high-dimensional matrix of reaction conditions, rendering exhaustive experimentation impractical. Here, we report a simple closed-loop workflow that leverages data-guided matrix down-selection, uncertainty-minimizing machine learning, and robotic experimentation to discover general reaction conditions. Application to the challenging and consequential problem of heteroaryl Suzuki-Miyaura cross-coupling identified conditions that double the average yield relative to a widely used benchmark that was previously developed using traditional approaches. This study provides a practical road map for solving multidimensional chemical optimization problems with large search spaces.
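As a rough illustration of what a closed-loop search of this kind might look like, the sketch below repeatedly picks the condition whose mean yield is currently least certain, runs one more experiment with it, and finally reports the condition with the highest average yield across substrates. This is not the authors' workflow: the selection rule (largest standard error of the mean), the function names, and the lookup table standing in for a robotic experiment are all assumptions made for illustration.

```python
import statistics


def standard_error(yields):
    """Standard error of the mean; infinite with fewer than two observations."""
    if len(yields) < 2:
        return float("inf")
    return statistics.stdev(yields) / len(yields) ** 0.5


def closed_loop(conditions, substrates, run_experiment, budget):
    """Spend `budget` experiments, always testing the least certain condition."""
    observed = {condition: [] for condition in conditions}
    for _ in range(budget):
        # Choose the condition whose average yield is most uncertain so far.
        condition = max(conditions, key=lambda c: standard_error(observed[c]))
        # Cycle through the substrates so each condition sees a varied set.
        substrate = substrates[len(observed[condition]) % len(substrates)]
        observed[condition].append(run_experiment(condition, substrate))
    return observed


def most_general(observed):
    """Condition with the highest mean yield among those that were tested."""
    tested = {c: ys for c, ys in observed.items() if ys}
    return max(tested, key=lambda c: statistics.mean(tested[c]))


# Hypothetical stand-in for a robotic experiment: a fixed table of yields (%).
mock_yields = {
    ("A", "s1"): 40, ("A", "s2"): 60,
    ("B", "s1"): 70, ("B", "s2"): 80,
}


def run_mock_experiment(condition, substrate):
    return mock_yields[(condition, substrate)]


observed = closed_loop(["A", "B"], ["s1", "s2"], run_mock_experiment, budget=6)
best = most_general(observed)
```

In a realistic setting the standard-error rule would be replaced by the predictive uncertainty of a trained model, and the candidate matrix would first be down-selected, as the abstract describes.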
Here, a unique visible-light-induced method for the organochalcogenation of the sp2 C–H bonds of indoles and aniline is presented. It uses diaryl dichalcogenides (S, Se, and Te) with oxygen as the oxidant in acetone and requires no photocatalyst, base, or additional catalyst or reagent.
A series of functionalized phenyl oxazole derivatives was designed, synthesized and screened in vitro for their activities against LSD1 and for effects on viability of cervical and breast cancer cells, and in vivo for effects using zebrafish embryos. These compounds are likely to act via multiple epigenetic mechanisms specific to cancer cells including LSD1 inhibition.

Histones are proteins that bind to DNA and facilitate efficient 'packing' of DNA in eukaryotic cells [1]. They are highly basic in nature due to the presence of positively charged amino acid side chains, causing the DNA to fold around them into compact structures (nucleosomes). The ability of histones to regulate gene expression is controlled by post-translational modifications on the N-terminal (and likely C-terminal) "tails" of the core histones, which project out of the nucleosome. Modifications including phosphorylation, acetylation, methylation, ubiquitination, sumoylation and biotinylation have been identified on the histone tails [2]. The cellular roles of histone lysine acetylation/deacetylation are the best characterized of the histone modifications: acetylation normally activates transcription whereas deacetylation deactivates this process [3]. Although there are clear links to disease [4-8], the role of histone methylation is much less understood and appears to be context dependent.

LSD1 (lysine specific demethylase 1) was the first histone demethylase identified. Its discovery significantly advanced the understanding of epigenetic regulation of gene expression, changing the paradigm that methylation is a non-reversible feature of histones [9]. Histone methylation/demethylation has since been found to be an important epigenetic modification linked to activation as well as repression of transcription. Two types of histone demethylases have been discovered. The flavin-dependent demethylase LSD1 acts on lysine 4 and lysine 9 of histone H3 (H3K4 and H3K9). LSD1 selectively catalyzes the oxidation of the methyl group of mono- and dimethylated lysines, resulting in an imine intermediate and generation of hydrogen peroxide. The imine product is non-enzymatically hydrolysed to generate a carbinolamine, resulting in demethylated lysine and formaldehyde release [10]. The other major class, i.e. Jumonji domain-containing histone demethylases, are Fe(II) and 2-oxoglutarate dependent oxygenases that act on mono-, di- and trimethylated Lys and methylated Arg residues depending on the particular enzymes [11].

Histone demethylase activity is associated with several pathological states. Increased LSD1 expression in prostate tumors correlates significantly with relapse during therapy [6,7]. Suppressed LSD1 expression is associated with vascular smooth muscle cell inflammatory damage in a mouse model of diabetes [8]. Demethylation of p53 (tumor suppressor) by LSD1 prevents p53 interaction with its co-activator 53BP1 [5]. Activation of the telomerase reverse transcriptase (hTERT) gene is known to be dependent on LSD1 levels and recruitment to the hTERT promoter [4]. Studies on LSD1 hav...