Mammalian Central Nervous System (CNS) neurons regrow their axons poorly following injury, resulting in irreversible functional losses. Identifying therapeutics that encourage CNS axon repair has been difficult, in part because multiple etiologies underlie this regenerative failure. This suggests a particular need for drugs that engage multiple molecular targets. Although multi-target drugs are generally more effective than highly selective alternatives, we lack systematic methods for discovering such drugs. Target-based screening is an efficient technique for identifying potent modulators of individual targets. In contrast, phenotypic screening can identify drugs with multiple targets; however, these targets remain unknown. To address this gap, we combined the two drug discovery approaches using machine learning and information theory. We screened compounds in a phenotypic assay with primary CNS neurons and also in a panel of kinase enzyme assays. We used learning algorithms to relate the compounds’ kinase inhibition profiles to their influence on neurite outgrowth. This allowed us to identify kinases that may serve as targets for promoting neurite outgrowth, as well as others whose targeting should be avoided. We found that compounds that inhibit multiple targets (polypharmacology) promote robust neurite outgrowth in vitro. One compound with exemplary polypharmacology, was found to promote axon growth in a rodent spinal cord injury model. A more general applicability of our approach is suggested by its ability to deconvolve known targets for a breast cancer cell line, as well as targets recently shown to mediate drug resistance.
Glucose is a simple sugar that plays an essential role in many basic metabolic and signaling pathways. Many proteins have binding sites that are highly specific to glucose. The exponential increase of genomic data has revealed the identity of many proteins that seem to be central to biological processes, but whose exact functions are unknown. Many of these proteins seem to be associated with disease processes. Being able to predict glucose-specific binding sites in these proteins will greatly enhance our ability to annotate protein function and may significantly contribute to drug design. We hereby present the first glucose-binding site classifier algorithm. We consider the sugar-binding pocket as a spherical spatio-chemical environment and represent it as a vector of geometric and chemical features. We then perform Random Forests feature selection to identify key features and analyze them using support vector machines classification. Our work shows that glucose binding sites can be modeled effectively using a limited number of basic chemical and residue features. Using a leave-one-out cross-validation method, our classifier achieves a 8.11% error, a 89.66% sensitivity and a 93.33% specificity over our dataset. From a biochemical perspective, our results support the relevance of ordered water molecules and ions in determining glucose specificity. They also reveal the importance of carboxylate residues in glucose binding and the high concentration of negatively charged atoms in direct contact with the bound glucose molecule.
Optimization is commonly employed to determine the content of web pages, such as to maximize conversions on landing pages or click-through rates on search engine result pages. Often the layout of these pages can be decoupled into several separate decisions. For example, the composition of a landing page may involve deciding which image to show, which wording to use, what color background to display, etc. Such optimization is a combinatorial problem over an exponentially large decision space. Randomized experiments do not scale well to this setting, and therefore, in practice, one is typically limited to optimizing a single aspect of a web page at a time. This represents a missed opportunity in both the speed of experimentation and the exploitation of possible interactions between layout decisions.Here we focus on multivariate optimization of interactive web pages. We formulate an approach where the possible interactions between different components of the page are modeled explicitly. We apply bandit methodology to explore the layout space efficiently and use hill-climbing to select optimal content in realtime. Our algorithm also extends to contextualization and personalization of layout selection. Simulation results show the suitability of our approach to large decision spaces with strong interactions between content. We further apply our algorithm to optimize a message that promotes adoption of an Amazon service. After only a single week of online optimization, we saw a 21% conversion increase compared to the median layout. Our technique is currently being deployed to optimize content across several locations at Amazon.com.
Breast cancer is the leading cause of cancer mortality in women between the ages of 15 and 54. During mammography screening, radiologists use a strict lexicon (BI-RADS) to describe and report their findings. Mammography records are then stored in a well-defined database format (NMD). Lately, researchers have applied data mining and machine learning techniques to these databases. They successfully built breast cancer classifiers that can help in early detection of malignancy. However, the validity of these models depends on the quality of the underlying databases. Unfortunately, most databases suffer from inconsistencies, missing data, inter-observer variability and inappropriate term usage. In addition, many databases are not compliant with the NMD format and/or solely consist of text reports. BI-RADS feature extraction from free text and consistency checks between recorded predictive variables and text reports are crucial to addressing this problem. We describe a general scheme for concept information retrieval from free text given a lexicon, and present a BI-RADS features extraction algorithm for clinical data mining. It consists of a syntax analyzer, a concept finder and a negation detector. The syntax analyzer preprocesses the input into individual sentences. The concept finder uses a semantic grammar based on the BI-RADS lexicon and the experts’ input. It parses sentences detecting BI-RADS concepts. Once a concept is located, a lexical scanner checks for negation. Our method can handle multiple latent concepts within the text, filtering out ultrasound concepts. On our dataset, our algorithm achieves 97.7% precision, 95.5% recall and an F1-score of 0.97. It outperforms manual feature extraction at the 5% statistical significance level.
Because breast tissue composition partially predicts breast cancer risk, classification of mammography reports by breast tissue composition is important from both a scientific and clinical perspective. A method is presented for using the unstructured text of mammography reports to classify them into BI-RADS breast tissue composition categories. An algorithm that uses regular expressions to automatically determine BI-RADS breast tissue composition classes for unstructured mammography reports was developed. The algorithm assigns each report to a single BI-RADS composition class: 'fatty', 'fibroglandular', 'heterogeneously dense', 'dense', or 'unspecified'. We evaluated its performance on mammography reports from two different institutions. The method achieves >99% classification accuracy on a test set of reports from the Marshfield Clinic (Wisconsin) and Stanford University. Since large-scale studies of breast cancer rely heavily on breast tissue composition information, this method could facilitate this research by helping mine large datasets to correlate breast composition with other covariates.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.