BackgroundIdentifying a regulatory module (RM), a bi-set of co-regulated genes and co-regulating conditions (or samples), has been an important challenge in functional genomics and bioinformatics. Given a microarray gene-expression matrix, biclustering has been the most common method for extracting RMs. Among biclustering methods, order-preserving biclustering by a sequential pattern mining technique has native advantage over the conventional biclustering approaches since it preserves the order of genes (or conditions) according to the magnitude of the expression value. However, previous sequential pattern mining-based biclustering has several weak points in that they can easily be computationally intractable in the real-size of microarray data and sensitive to inherent noise in the expression value.ResultsIn this paper, we propose a novel sequential pattern mining algorithm that is scalable in the size of microarray data and robust with respect to noise. When applied to the microarray data of yeast, the proposed algorithm successfully found long order-preserving patterns, which are biologically significant but cannot be found in randomly shuffled data. The resulting patterns are well enriched to known annotations and are consistent with known biological knowledge. Furthermore, RMs as well as inter-module relations were inferred from the biologically significant patterns.ConclusionsOur approach for identifying RMs could be valuable for systematically revealing the mechanism of gene regulation at a genome-wide level.
Infection by microorganisms may cause fatally erroneous interpretations in the biologic researches based on cell culture. The contamination by microorganism in the cell culture is quite frequent (5% to 35%). However, current approaches to identify the presence of contamination have many limitations such as high cost of time and labor, and difficulty in interpreting the result. In this paper, we propose a model to predict cell infection, using a microarray technique which gives an overview of the whole genome profile. By analysis of 62 microarray expression profiles under various experimental conditions altering cell type, source of infection and collection time, we discovered 5 marker genes, NM_005298, NM_016408, NM_014588, S76389, and NM_001853. In addition, we discovered two of these genes, S76389, and NM_001853, are involved in a Mycolplasma-specific infection process. We also suggest models to predict the source of infection, cell type or time after infection. We implemented a web based prediction tool in microarray data, named Prediction of Microbial Infection (http://www.snubi.org/software/PMI).
Biological pathways are known as collections of knowledge of certain biological processes. Although knowledge about a pathway is quite significant to further analysis, it covers only tiny portion of genes that exists. In this paper, we suggest a model to extend each individual pathway using a microarray expression data based on the known knowledge about the pathway. We take the Rosetta compendium dataset to extend pathways of Saccharomyces cerevisiae obtained from KEGG (Kyoto Encyclopedia of genes and genomes) database. Before applying our model, we verify the underlying assumption that microarray data reflect the interactive knowledge from pathway, and we evaluate our scoring system by introducing performance function. In the last step, we validate proposed candidates with the help of another type of biological information. We introduced a pathway extending model using its intrinsic structure and microarray expression data. The model provides the suitable candidate genes for each single biological pathway to extend it.
Cluster analysis is the most important method for analyzing large-scale gene expression patterns. The matrix representation of microarray data and its successive 'optimal' incisional hyperplanes that create topdown hierarchical tree are a useful platform for developing optimization algorithms to determine the 'optimal' clusters from a pairwise proximity matrix which represents completely connected and weighted graph [ll. Evolution strategy is applied to determine the 'globally optimal' incisional hyperplanes to construct hierarchical tree structure and tested with Fisher's iris and Golub's leukemia data sets. The results were compared with those of bottom-up hierarchical clustering, K-means and SOMs(Se1f-Organizing Maps) algorithms with promising results.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.