In this paper, we define a family of patterns with don't cares occurring in a text. We call them primitive patterns. The set of primitive patterns forms a basis for all the maximal patterns occurring in the text. The number of primitive patterns is smaller than other known basis.We present an incremental algorithm that computes the primitive patterns occurring at least q times in a text of length n, given the N primitive patterns occurring at least q − 1 times, in time O(|Σ|Nn 2 log n), where Σ is the alphabet. In the particular case where q = 2, the complexity in time is only O(|Σ|n 2 log n). We also give an algorithm that decides if a given pattern is primitive in a given text. These algorithms are generalized, taking many sequences in input. Finally, we give a solution for reducing the number of patterns of interest by using scoring techniques, as we show that the number of primitive patterns is exponential.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.