Background
The metabolic network of a cell is the complete set of interconnected metabolic processes that determine the physiological and biochemical properties of the cell. In recent years, metabolic networks have contributed enormously to our understanding of the relationship between metabolic genotype and phenotype, leading to important applications in systems biology and metabolic engineering. Metaproteome-scale metabolic network reconstruction has also emerged as a promising and challenging approach for investigating the metabolism of microbial communities [1]. In recent years, there has been an effort to reconstruct genome-scale metabolic networks for hundreds of species [2][3][4][5][6][7]. In principle, the reconstruction of a metabolic network is an iterative multi-stage process [8,9] that starts from gene annotation and goes all the way to network development. Several sophisticated techniques have been developed for metabolic network reconstruction [10][11][12][13][14][15].
Metabolic Gaps and their Implications
However, most of the reconstructed networks remain incomplete: they contain significant numbers of metabolic gaps [16,17]. A metabolic gap in a network for genome (G) is a metabolic reaction (R) (described by its EC number) that is present in the network, but for which annotation through the network reconstruction methods has failed to find the corresponding gene in G that is responsible for R. We distinguish two types of metabolic gaps: a local metabolic gap, where the corresponding gene responsible for R can be found in other related organisms, and a global metabolic gap, where the corresponding gene responsible for R has not been found in any known organism or has not been so annotated in any genome.
Metabolic gaps impede downstream biological analysis of these metabolic networks. For example, in the reconstructed metabolic networks of the yeast Saccharomyces cerevisiae [2], the filamentous fungi Aspergillus oryzae [3], Aspergillus nidulans [4], and Aspergillus niger [5], and the bacterium Streptomyces coelicolor [6], between 6% and 19% of the biochemical reactions are metabolic gaps. Filling these metabolic gaps (and thus enhancing these networks) is the most time-consuming task and may take years to complete, since a lot of manual curation is involved [16][17][18]. This can be a bottleneck in obtaining high-quality metabolic networks. Our work therefore proposes new algorithmic methods to fill these metabolic gaps.
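The distinction between local and global metabolic gaps defined in this section can be illustrated with a small sketch. This is not the method proposed in this work; it only restates the definitions as code. The function names, the dictionary layout (EC number mapped to a set of gene identifiers), the organism labels, and the gene identifiers are all hypothetical, and the set of reference organisms stands in for "any known organism" purely as an assumption for the toy example.

```python
from typing import Dict, Set

# Hypothetical layout: EC number -> gene identifiers annotated for that reaction.
Annotation = Dict[str, Set[str]]


def classify_gaps(
    network_ecs: Set[str],
    target: Annotation,
    references: Dict[str, Annotation],
) -> Dict[str, str]:
    """Label each reaction in the network that has no gene in the target genome.

    A gap is "local" if some reference organism has a gene annotated for the
    reaction, and "global" if no reference organism does.
    """
    gaps: Dict[str, str] = {}
    for ec in sorted(network_ecs):
        if target.get(ec):
            continue  # a gene is assigned in the target genome: not a gap
        found_elsewhere = any(ref.get(ec) for ref in references.values())
        gaps[ec] = "local" if found_elsewhere else "global"
    return gaps


def gap_fraction(network_ecs: Set[str], gaps: Dict[str, str]) -> float:
    """Fraction of reactions in the network that are metabolic gaps."""
    return len(gaps) / len(network_ecs) if network_ecs else 0.0


# Toy data (illustrative placeholders, not taken from any real reconstruction).
network = {"1.1.1.1", "2.7.1.1", "4.2.1.11", "5.3.1.9"}
target_genome = {"1.1.1.1": {"g1"}, "5.3.1.9": {"g2"}}
related_organisms = {
    "org_A": {"2.7.1.1": {"a7"}},
    "org_B": {"1.1.1.1": {"b3"}},
}

gaps = classify_gaps(network, target_genome, related_organisms)
fraction = gap_fraction(network, gaps)
print(gaps, fraction)
```

In this toy network, one reaction lacks a gene in the target genome but has one in a related organism (a local gap), and another has no annotated gene anywhere in the reference set (a global gap).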
Current Metabolic Gap Filling Methods
Current direct methods for filling local metabolic gaps (i.e. genes that are un-annotated in the target organism, but have been found in ...
Abstract
Background: A bottleneck in investigating the cellular metabolism and physiology of organisms is the presence of metabolic gaps in the genome-scale metabolic networks. Metabolic gaps are reactions in the network for which the corresponding genes have not yet been identified. Previous gap filling methods are generally based on identifying protein families in related organisms and then use ...