Abstract. The most strings with few bad columns problem is an NP-hard combinatorial optimization problem from the field of bioinformatics. This paper presents the first integer linear programming model for this problem. Moreover, a simple greedy heuristic and a more sophisticated extension, namely a greedy-based pilot method, are proposed. Experiments show that, as expected, the greedy-based pilot method improves over the greedy strategy. For problem instances of small and medium size, the best results were obtained by solving the integer linear programming model with CPLEX, while the greedy-based pilot method scales much better to large problem instances.
Abstract. Both Construct, Merge, Solve and Adapt (CMSA) and Large Neighborhood Search (LNS) are hybrid algorithms based on iteratively solving sub-instances of the original problem instance, if possible to optimality. This is done by reducing the search space of the tackled problem instance in algorithm-specific ways that differ from one technique to the other. In this paper we provide first experimental evidence for the intuition that, owing to the way in which the search space is reduced, LNS should generally work better than CMSA for problems in which solutions are rather large, while the opposite holds for problems in which solutions are rather small. The size of a solution is measured here by the number of components of which the solution is composed, relative to the total number of solution components. Experiments are conducted in the context of the multi-dimensional knapsack problem.
In this work we consider the following NP-hard combinatorial optimization problem from computational biology. Given a set of input strings of equal length, the goal is to identify a maximum cardinality subset of strings that differ maximally in a pre-defined number of positions. First, we introduce an integer linear programming model for this problem. Second, two variants of a rather simple greedy strategy are proposed. Finally, a large neighborhood search algorithm is presented. A comprehensive experimental comparison among the proposed techniques shows, first, that large neighborhood search generally outperforms both greedy strategies. Second, while large neighborhood search is competitive with the stand-alone application of CPLEX for small and medium-sized problem instances, it outperforms CPLEX in the context of larger instances.
Keywords: Most strings with few bad columns, integer linear programming, large neighborhood search.
* A preliminary version of this work appeared at