For a given parameterized problem, π, a kernelization algorithm is a polynomial-time pre-processing procedure that transforms an arbitrary instance of π into an equivalent one whose size depends only on the input parameter(s). The resulting instance is called a problem kernel. In this paper, a kernelization algorithm for the 3-Hitting Set problem is presented, along with a general kernelization for d-Hitting Set. For 3-Hitting Set, an arbitrary instance is reduced into an equivalent one that contains at most 5k^2 + k elements. This kernelization is an improvement over previously known methods, which guarantee only cubic-order kernels. Our method is also used to obtain quadratic kernels for several other problems. For a constant d ≥ 3, a kernelization of d-Hitting Set is achieved by a non-trivial generalization of the 3-Hitting Set method, and guarantees a kernel whose order does not exceed (2d − 1)k^(d−1) + k.
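To make the flavor of such a pre-processing step concrete, the sketch below applies two reduction rules that are standard in kernelizations of 3-Hitting Set: a pair of elements occurring together in more than k sets can be contracted to that pair, and an element that then still occurs in more than k^2 sets must belong to every hitting set of size at most k. This is an illustrative Python sketch of those textbook rules under assumed input conventions (hashable, orderable elements; sets of size at most 3), not a reproduction of the paper's algorithm or its 5k^2 + k size analysis.

```python
from collections import Counter
from itertools import combinations

def reduce_3hs(sets, k):
    """Illustrative high-occurrence reduction rules for 3-Hitting Set.

    `sets` is a collection of sets of size at most 3; `k` is the solution-size
    parameter. Returns (reduced_sets, k, forced), where `forced` collects
    elements that must belong to every hitting set of size <= k.
    Sketch of standard rules from the literature, not the paper's exact procedure.
    """
    sets = {frozenset(s) for s in sets}
    forced = set()
    changed = True
    while changed and k > 0:
        changed = False
        # Rule 1: a pair occurring in more than k distinct sets must be hit by
        # one of its two elements, so those sets can be replaced by the pair.
        pair_count = Counter(p for s in sets if len(s) == 3
                             for p in combinations(sorted(s), 2))
        for (x, y), c in pair_count.items():
            if c > k:
                sets = {s for s in sets if not {x, y} <= s} | {frozenset((x, y))}
                changed = True
                break
        if changed:
            continue
        # Rule 2: once Rule 1 is exhausted, an element occurring in more than
        # k^2 sets must be in every hitting set of size <= k; take it.
        elem_count = Counter(e for s in sets for e in s)
        for x, c in elem_count.items():
            if c > k * k:
                forced.add(x)
                sets = {s for s in sets if x not in s}
                k -= 1
                changed = True
                break
    return sets, k, forced
```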
Algorithmic methods based on the theory of fixed-parameter tractability are combined with powerful computational platforms to launch systematic attacks on combinatorial problems of significance. As a case study, optimal solutions to very large instances of the NP-hard vertex cover problem are computed. To accomplish this, an efficient sequential algorithm and various forms of parallel algorithms are devised, implemented, and compared. The importance of maintaining a balanced decomposition of the search space is shown to be critical to achieving scalability. Target problems need only be amenable to reduction and decomposition. Applications in high-throughput computational biology are also discussed.
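As a point of reference for the fixed-parameter machinery mentioned above, the following is a minimal sketch of the classical O(2^k) bounded search tree for k-Vertex Cover, the kind of building block such solvers start from. It is a toy illustration under assumed input conventions (edges as 2-tuples of hashable vertices) and does not reflect the paper's actual contributions, namely kernelization, balanced parallel decomposition of the search space, and engineering for very large instances.

```python
def vertex_cover_branch(edges, k):
    """Classical O(2^k) bounded search tree for k-Vertex Cover.

    Returns a cover of size <= k as a set, or None if none exists.
    Minimal sketch of the branching idea, not the paper's implementation.
    """
    edges = [tuple(e) for e in edges]
    if not edges:          # every edge is covered
        return set()
    if k == 0:             # budget exhausted but an edge remains uncovered
        return None
    u, v = edges[0]
    # Branch: any vertex cover must contain u or v for this edge.
    for w in (u, v):
        rest = [e for e in edges if w not in e]
        sub = vertex_cover_branch(rest, k - 1)
        if sub is not None:
            return sub | {w}
    return None
```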
Graph-theoretical approaches to biological network analysis have proven to be effective for small networks but are computationally infeasible for comprehensive genome-scale, systems-level elucidation of these networks. The difficulty lies in the NP-hard nature of many global systems biology problems, which in practice translates to exponential (or worse) run times for finding exact optimal solutions. Moreover, these problems, especially those of an enumerative flavor, are often memory-intensive and must share very large sets of data effectively across many processors. For example, the enumeration of maximal cliques (a core component in gene expression network analysis, cis-regulatory motif finding, and the study of quantitative trait loci for high-throughput molecular phenotypes) can result in as many as 3^(n/3) maximal cliques for a graph with n vertices. Memory requirements to store those cliques reach terabyte scales even on modest-sized genomes. Emerging hardware architectures with ultra-large globally addressable memory, such as the SGI Altix and Cray X1, seem well suited for addressing these types of data-intensive problems in systems biology. This paper presents a novel framework that provides exact, parallel, and scalable solutions to various graph-theoretical approaches to genome-scale elucidation of biological networks. This framework takes advantage of these large-memory architectures by creating globally addressable bitmap memory indices with potentially high compression rates, fast bitwise-logical operations, and a reduced search space. Augmented with recent theoretical advancements based on fixed-parameter tractability, this framework makes genome-scale combinatorial problems of systems biology computationally feasible.
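To make the combinatorial blow-up quoted above concrete, here is a compact sequential Bron-Kerbosch enumeration of maximal cliques, the task whose worst case reaches 3^(n/3) cliques on n vertices. It is an illustrative sketch only, assuming an adjacency-set input format, and shows neither the bitmap indices nor the parallelism that the paper's framework relies on.

```python
def bron_kerbosch(adj):
    """Enumerate all maximal cliques of a graph given as {v: set(neighbors)}.

    Sequential Bron-Kerbosch with pivoting; the output itself can be
    exponential in the number of vertices (up to 3^(n/3) cliques).
    """
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(r)          # r cannot be extended: it is maximal
            return
        # Choose a pivot with many neighbors in p to prune branches.
        pivot = max(p | x, key=lambda u: len(adj[u] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p.remove(v)
            x.add(v)

    expand(set(), set(adj), set())
    return cliques

# Example: a triangle plus a pendant vertex.
# adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
# bron_kerbosch(adj) yields the cliques {1, 2, 3} and {3, 4}.
```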