Establishing transcriptional regulatory networks by analysis of gene expression data and promoter sequences shows great promise. We developed a novel promoter classification method that uses a Relevance Vector Machine (RVM) and Bayesian statistical principles to identify discriminatory features in the promoter sequences of genes that can correctly classify transcriptional responses. The method was applied to microarray data obtained from Arabidopsis seedlings treated with glucose or abscisic acid (ABA). Of the genes showing >2.5-fold changes in expression level, ~70% were correctly predicted as being up- or down-regulated (under 10-fold cross-validation), based on the presence or absence of a small set of discriminative promoter motifs. Many of these motifs have known regulatory functions in sugar- and ABA-mediated gene expression. One promoter motif that was not previously known to be involved in glucose-responsive gene expression was identified as the strongest classifier of glucose-up-regulated gene expression. We show that it confers glucose-responsive gene expression in conjunction with another promoter motif, thus validating the classification method. We were able to establish a detailed model of the glucose and ABA transcriptional regulatory networks and their interactions, which will help us to understand the mechanisms linking metabolism with growth in Arabidopsis. This study shows that machine learning strategies coupled with Bayesian statistical methods hold significant promise for identifying functionally significant promoter sequences.
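The evaluation described above (predicting up- or down-regulation from motif presence or absence, scored under 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' method: a small Bernoulli naive Bayes classifier stands in for the RVM, and the function names, the toy genes, and the two "motif" features are all hypothetical, invented only for illustration.

```python
import math
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def train_bernoulli_nb(X, y):
    """Fit Bernoulli naive Bayes on binary features (a stand-in for the RVM)."""
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)
        # Laplace smoothing keeps every probability strictly between 0 and 1.
        probs = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                 for j in range(len(X[0]))]
        model[c] = (prior, probs)
    return model


def predict(model, x):
    """Return the class with the highest log-posterior for feature vector x."""
    def score(c):
        prior, probs = model[c]
        return math.log(prior) + sum(
            math.log(p if xi else 1.0 - p) for xi, p in zip(x, probs))
    return max(model, key=score)


def cross_validate(X, y, k=10, seed=0):
    """Return the fraction of samples classified correctly under k-fold CV."""
    correct = 0
    for fold in k_fold_indices(len(X), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        model = train_bernoulli_nb([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in fold)
    return correct / len(X)


# Hypothetical toy data: each row is [motif_A present, motif_B present].
# motif_A perfectly separates the classes; motif_B is uninformative.
X = [[1, i % 2] for i in range(20)] + [[0, i % 2] for i in range(20)]
y = ["up"] * 20 + ["down"] * 20

accuracy = cross_validate(X, y, k=10)
```

Each gene is held out exactly once, so `accuracy` is the share of genes whose direction of regulation was predicted by a model that never saw them during training, which is the sense in which the ~70% figure above is reported.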
There are many ways to build a predictive model from data. Besides the numerous classification and regression algorithms to choose from, there are countless possible data transformations that can usefully be applied prior to modeling. To assist in discovering good predictive analytics workflows, we recently introduced a collaborative analytics system that allows workflow sharing and reuse. We designed a recommendation engine for the system that matches analytics needs with relevant workflows stored in a repository. The engine relies on meta-predictive modeling of the characteristics of traffic-analysis workflows. In this paper, we present a feasibility study of applying this collaborative analytics system to predict traffic congestion. Different ways of building predictive models from a traffic dataset are pooled as shared workflows. We demonstrate that reliable congestion prediction can be achieved by dynamically recommending workflows that suit the traffic data as it varies in real time. The promising results show that the systematic collaboration among data scientists made possible by our system can be a powerful tool for producing very accurate predictions from data.
A contact network is a good representation of heterogeneous contact behaviors within a population. Incorporating contact networks as well as community structures is important for realistic modeling and simulation of the spread of infectious diseases. We developed "HPCgen", a fast and generic generator of contact networks for large urban cities, with the capacity to automate network regeneration for intervention studies. The contact networks it produces are applicable in both analytical modeling and agent-based simulations. In this paper, we present the design and realization of HPCgen, followed by the empirical results of building Singapore contact networks with six types of community structures found in common urban settings. The results show that our 8-node parallelized HPCgen could generate a contact network for a population of 3.4 million within 62.17 seconds, a 90% reduction in runtime.