Recent advances in high-throughput technologies have generated an abundance of biological information, such as gene expression, protein-protein interaction, and metabolic data. These various types of data capture different aspects of the cellular response to environmental factors. Integrating data from different measurements enhances the ability of modeling frameworks to predict cellular function more accurately and can lead to a more coherent reconstruction of the underlying regulatory network structure. Different techniques, both newly developed and borrowed, have been applied to extract this information from experimental data. In this study, we developed a framework to integrate metabolic and gene expression profiles for a hepatocellular system. Specifically, we applied a genetic algorithm and partial least squares analysis to identify important genes relevant to a specific cellular function. We identified genes 1) whose expression levels quantitatively predict a metabolic function and 2) that play a part in regulating a hepatocellular function, and we reconstructed their role in the metabolic network. The framework 1) preprocesses the gene expression data using statistical techniques, 2) selects genes using a genetic algorithm and couples them to a partial least squares analysis to predict cellular function, and 3) reconstructs, with the assistance of a literature search, the pathways that regulate cellular function, namely intracellular triglyceride and urea synthesis. This provides a framework for identifying cellular pathways that are active as a function of the environment and in turn helps to uncover the interplay between gene and metabolic networks.

The development of high-throughput technologies has given rise to a wealth of information, entailing new paradigms for analyzing and gaining insight into biological processes.
Systems of genes, proteins, and metabolites for a given state are now readily obtained, necessitating a systems-based analysis and modeling of cellular processes that integrates the biological information from these different scales. With this integration, it becomes possible to perturb cellular systems in silico and to provide a platform for predicting cellular or physiological function across a range of conditions, thus enabling the engineering of biological behavior. A model capable of integrating gene expression and metabolic data would therefore facilitate this goal.

The significance of this goal is evident from the numerous efforts devoted to this area. Generally, the efforts fall into two categories: 1) knowledge-based simulation and 2) data-based pattern discovery. Deterministic and stochastic approaches have been developed for the first category. Deterministic models include Virtual Cell, E-Cell, GEPASI, and DBSOLVE, and stochastic models include StochSim and M-Cell (see Ref. 1 for a review of these various models). Both deterministic and stochastic approaches rely heavily on the availability of existing biological knowledge to define the model. This i...