Background: Vibrio (V.) parahaemolyticus causes seafood-borne gastrointestinal bacterial infections in humans worldwide. It is widely found in marine environments and is frequently isolated from seawater, estuarine waters, sediments and raw or insufficiently cooked seafood. Throughout the food chain, V. parahaemolyticus encounters different temperature conditions that might alter the metabolism and pathogenicity of the bacterium. In this study, we performed gene expression profiling of V. parahaemolyticus RIMD 2210633 after exposure to 4, 15, 20, 37 and 42 °C to describe the cold and heat shock response. Methods: Gene expression profiles of V. parahaemolyticus RIMD 2210633 after exposure to 4, 15, 20, 37 and 42 °C were investigated via microarray. Microarray gene expression values were compared with RT-qPCR results by plotting the log2 values. Moreover, volcano plots of the microarray data were generated to visualize the distribution of differentially expressed genes at individual temperatures and to assess hybridization quality and comparability of the data. Finally, enriched annotation terms and functionally related gene categories were identified using the Database for Annotation, Visualization and Integrated Discovery. Results: Analysis of transcriptomics data normalised to 37 °C showed differential expression of 19 genes at 20 °C, 193 genes at 4 °C, 625 genes at 42 °C and 638 genes at 15 °C. Thus, the largest numbers of differentially expressed genes were observed at 15 and 42 °C, with 13.3 and 13 %, respectively. Genes of many functional categories were highly regulated even at lower temperatures. Virulence-associated genes (tdh1, tdh2, toxR, toxS, vopC, T6SS-1, T6SS-2) remained mostly unaffected by heat or cold stress. Conclusion: Along with protein folding and temperature shock dependent systems, an overall temperature-dependent regulation of expression could be shown. Energy metabolism in particular was affected by changes in temperature. Whole-genome gene expression studies of food-related pathogens such as V. parahaemolyticus reveal how these pathogens react to stress, which helps to predict their behaviour under conditions such as storage and transport.
Electronic supplementary material: The online version of this article (doi:10.1186/s12866-015-0565-7) contains supplementary material, which is available to authorized users.
The problem of Bayesian filtering and smoothing in nonlinear models with additive noise is an active area of research. Classical Taylor series methods and more recent sigma-point based methods are two well-known strategies for dealing with this problem. However, these methods are inherently sequential and, in their standard formulation, do not allow for parallelization in the time domain. In this paper, we present a set of parallel formulas that replace the existing sequential ones in order to achieve lower time (span) complexity. Our experiments on a graphics processing unit (GPU) illustrate the efficiency of the proposed methods over their sequential counterparts.
Background: In bioprocess development, the needs of data analysis include (1) getting an overview of existing data sets, (2) identifying primary control parameters, (3) determining a useful control direction, and (4) planning future experiments. In particular, when multiple data sets are integrated, these needs cannot be properly addressed by regression models that assume a linear input-output relationship or a unimodal response function. Regularized regression and random forests, on the other hand, have several properties that may prove important in this context. They are capable of, e.g., handling a small number of samples relative to the number of variables, performing feature selection, and visualizing response surfaces in order to present the prediction results in an illustrative way. Results: In this work, the applicability of regularized regression (Lasso) and random forests (RF) in bioprocess data mining was examined, and their performance was benchmarked against multiple linear regression. As an example, we used data from a culture media optimization study for microbial hydrogen production. All three methods were capable of providing a significant model when the five variables of the culture media optimization were included linearly in the modeling. However, multiple linear regression failed when the products and squares of the variables were also included. In this case, the modeling was still successful with Lasso (correlation between the observed and predicted yield was 0.69) and RF (0.91). Conclusion: We found that both regularized regression and random forests were able to produce feasible models, and the latter was efficient in capturing the non-linearity in the data. In this kind of data mining task on bioprocess data, both methods outperformed multiple linear regression.
This paper presents algorithms for the parallelization of inference in hidden Markov models (HMMs). In particular, we propose a parallel forward-backward type of filtering and smoothing algorithm as well as a parallel Viterbi-type maximum a posteriori (MAP) algorithm. We define associative elements and operators to pose these inference problems as all-prefix-sums computations and parallelize them using the parallel-scan algorithm. The advantage of the proposed algorithms is that they are computationally efficient in HMM inference problems with long time horizons. We empirically compare the performance of the proposed methods to classical methods on a highly parallel graphics processing unit (GPU).
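The idea of posing HMM filtering as an all-prefix-sums computation can be illustrated with a minimal sketch. It makes its own assumptions and is not the paper's exact formulation: each time step is represented by a matrix whose entries are transition probability times observation likelihood, the associative operator is a rescaled matrix product, and the scan is a simple Hillis-Steele inclusive scan written in plain Python. All function names here are hypothetical. The scan runs sequentially in this sketch, but within each round every combination is independent of the others, which is what a GPU implementation would exploit.

```python
def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(a, b):
    """Associative operator (assumed): matrix product, rescaled to avoid underflow."""
    c = matmul(a, b)
    total = sum(map(sum, c))
    return [[v / total for v in row] for row in c]

def make_elements(prior, trans, lik):
    """One matrix per time step: element[i][j] = trans[i][j] * lik[k][j].
    The first element has every row equal to prior[j] * lik[0][j]."""
    d = len(prior)
    first = [[prior[j] * lik[0][j] for j in range(d)] for _ in range(d)]
    rest = [[[trans[i][j] * l[j] for j in range(d)] for i in range(d)]
            for l in lik[1:]]
    return [first] + rest

def inclusive_scan(elems):
    """Hillis-Steele inclusive scan: about log2(T) rounds of independent combines."""
    x = list(elems)
    shift = 1
    while shift < len(x):
        # Each k below depends only on the previous round, so a parallel
        # implementation could compute all of them at once.
        x = [x[k] if k < shift else combine(x[k - shift], x[k])
             for k in range(len(x))]
        shift *= 2
    return x

def forward_parallel(prior, trans, lik):
    """Filtering distributions p(x_k | y_1..k) read off the scanned prefixes."""
    out = []
    for m in inclusive_scan(make_elements(prior, trans, lik)):
        row = m[0]  # all rows are proportional once the first element is included
        s = sum(row)
        out.append([v / s for v in row])
    return out

def forward_sequential(prior, trans, lik):
    """Classical forward filter, used here only as a reference."""
    d = len(prior)
    alpha = [prior[j] * lik[0][j] for j in range(d)]
    s = sum(alpha)
    out = [[v / s for v in alpha]]
    for l in lik[1:]:
        alpha = [sum(out[-1][i] * trans[i][j] for i in range(d)) * l[j]
                 for j in range(d)]
        s = sum(alpha)
        out.append([v / s for v in alpha])
    return out
```

With a small made-up model, `forward_parallel(prior, trans, lik)` and `forward_sequential(prior, trans, lik)` should agree up to floating-point error, where `lik[k][j]` is the likelihood of the k-th observation given state j. Replacing the sum in the matrix product with a maximum gives the analogous structure for a Viterbi-type recursion, although recovering the MAP path requires additional bookkeeping that this sketch omits.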
Abstract: In this paper, we study the problem of feature selection in cancer-related machine learning tasks. In particular, we study the accuracy and stability of different feature selection approaches within simplistic machine learning pipelines. Earlier studies have shown that for certain cases, the accuracy of detection can easily reach 100% given enough training data. Here, however, we concentrate on simplifying the classification models and seek feature selection approaches that are reliable even with extremely small sample sizes. We show that as much as 50% of features can be discarded without compromising the prediction accuracy. Moreover, we study the model selection problem along the L1 regularization path of logistic regression classifiers. To this end, we compare a more traditional cross-validation approach with a recently proposed Bayesian error estimator.