We present a new descriptor, named signature, based on the extended valence sequence. The signature of an atom is a canonical representation of the atom's environment up to a predefined height h. The signature of a molecule is a vector of occurrence numbers of atomic signatures. QSAR and QSPR models based on signature are compared, on two data sets, with models obtained using popular 2D molecular descriptors taken from commercially available software (Molconn-Z). The first data set contains the 50% inhibitory concentrations for 121 HIV-1 protease inhibitors, while the second contains 12865 octanol/water partition coefficients (Log P). For both data sets, the models built on signature performed comparably to those built on the commercially available descriptors, both in correlating the data and in predicting test set values not used in the parametrization. While probing signature's QSAR and QSPR performance, we demonstrate that for any given molecule of diameter D, there is a molecular signature of height h = D+1 from which any 2D descriptor can be computed. As a consequence of this finding, any QSAR or QSPR involving 2D descriptors can be replaced with a relationship involving occurrence numbers of atomic signatures.
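The two definitions above (atomic signature up to height h, molecular signature as a vector of occurrence numbers) can be illustrated with a minimal sketch. This is a simplification under stated assumptions, not the authors' algorithm: atoms are labeled by element symbol only, bond orders and the extended valence sequence are ignored, and the environment is unfolded as a plain tree with sorted child strings standing in for canonicalization. The function names `atom_signature` and `molecular_signature` and the adjacency-list input format are hypothetical.

```python
from collections import Counter


def atom_signature(atoms, adj, root, height):
    """Return a canonical string for the environment of `root`.

    Assumption: the environment is unfolded as a tree of depth `height`,
    never stepping straight back to the parent atom. Child strings are
    sorted so the result does not depend on neighbor ordering.
    """
    def expand(node, parent, h):
        label = atoms[node]
        if h == 0:
            return label
        children = sorted(
            expand(nb, node, h - 1) for nb in adj[node] if nb != parent
        )
        return label + "".join("(" + c + ")" for c in children)

    return expand(root, None, height)


def molecular_signature(atoms, adj, height):
    """Count how often each atomic signature occurs in the molecule."""
    return Counter(
        atom_signature(atoms, adj, i, height) for i in range(len(atoms))
    )


# Heavy-atom graph of ethanol (C-C-O), hydrogens omitted for brevity.
ethanol_atoms = ["C", "C", "O"]
ethanol_adj = {0: [1], 1: [0, 2], 2: [1]}
ethanol_sig = molecular_signature(ethanol_atoms, ethanol_adj, height=1)
```

At height 0 the molecular signature reduces to element counts; increasing the height makes each atomic signature describe a larger neighborhood, which is the sense in which a sufficiently tall signature can encode other 2D descriptors.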
The microbial production of fine chemicals provides a promising biosustainable manufacturing solution that has led to the successful production of a growing catalog of natural products and high-value chemicals. However, development at industrial levels has been hindered by the large resource investments required. Here we present an integrated Design–Build–Test–Learn (DBTL) pipeline for the discovery and optimization of biosynthetic pathways, which is designed to be compound agnostic and automated throughout. We initially applied the pipeline to the production of the flavonoid (2S)-pinocembrin in Escherichia coli, to demonstrate rapid iterative DBTL cycling with automation at every stage. In this case, two DBTL cycles successfully established a production pathway improved by 500-fold, with competitive titers of up to 88 mg L−1. The further application of the pipeline to optimize an alkaloid pathway demonstrates how it could facilitate the rapid optimization of microbial strains for production of any chemical compound of interest.