Tuning the Boyer-Moore-Horspool string searching algorithm

Raita, T.

doi:10.1002/spe.4380221006

Cited by 66 publications

(34 citation statements)

References 9 publications


“…In fact, they proposed two algorithms, one for presentation and another which performs best in practical applications (Hume and Sunday, 1991). Since then, some improvements or modifications have been added and parts of the algorithm have been proposed as non beneficial in certain situation (Hume and Sunday, 1991, Horspool, 1980, Raita, 1992). In general, the algorithm and its variants build on the idea that comparing a pattern from its rightmost end to the text in question allows for larger shifts which can be precomputed from the pattern.…”

Section: Boyer-Moore String Matching Algorithm (mentioning)

confidence: 99%
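The idea in the statement above (compare from the rightmost end, shift by a distance precomputed from the pattern) can be sketched with the Horspool simplification of Boyer-Moore. This is a minimal illustration, not the implementation from any of the cited papers; the function name `horspool_search` and the dictionary-based shift table are assumptions made for the example.

```python
def horspool_search(text: str, pattern: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # Precomputed shift table: for each character of the pattern except the
    # last, the distance from its rightmost occurrence to the pattern's end.
    # Characters not in the table shift the window by the full length m.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}

    matches = []
    pos = 0
    while pos <= n - m:
        # Compare the window with the pattern from right to left.
        j = m - 1
        while j >= 0 and text[pos + j] == pattern[j]:
            j -= 1
        if j < 0:
            matches.append(pos)
        # The shift depends only on the text character under the pattern's
        # last position, whether or not the window matched.
        pos += shift.get(text[pos + m - 1], m)
    return matches
```

For example, `horspool_search("abracadabra", "abra")` finds the occurrences at positions 0 and 7. On a small alphabet such as DNA, most characters appear in the pattern, so the precomputed shifts tend to be short.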

“…This is even more pronounced when a large number of matches are expected in the text or if the suffix of the pattern is abundant in the text. Raita created a variant of the BMH algorithm which introduced sentinels in order to speed up searches by first comparing the parts of the pattern with the weakest dependencies (Raita, 1992). He reported an improvement of approximately 25% over the BMH algorithm but it has been shown by Smith to be solely due to sentinel use, as opposed to character dependencies within the pattern, as Raita concluded (Smith, 1994).…”

Section: Introduction (mentioning)

confidence: 99%


Exact pattern matching: Adapting the Boyer-Moore algorithm for DNA searches

Allmer

2016

Preprint


Exact pattern matching aims to locate all occurrences of a pattern in a text. Many algorithms have been proposed, but two algorithms, the Knuth-Morris-Pratt (KMP) and the Boyer-Moore (BM), are most widespread. It is the basis of some approximate string matching algorithms like BLAST, and in many cases it is desirable to locate an exact rather than approximate matches. Although several studies included measures with small alphabets, none of them specifically designed an algorithm to target nucleotide sequences. Since there are also no application programming interfaces available for pattern matching in nucleotide sequences, these two issues were aimed to be resolved. A portion of the Chlamydomonas reinhardtii genome (30 mega bases) was searched with queries ranging from 10 to 2000 nucleotides and an alternating number of matches between one and 25000. The results indicate that the use of two of the algorithms developed in this study is sufficient to efficiently cover the complete search space as presented in the experiment conducted here. Thus the aim of implementing an algorithm specifically targeting pattern matching in nucleotide sequences and making it available to the general public as an advanced programming interface was achieved. All algorithms are freely available at:


“…The major advantage of this method is flexibility in adapting to different edit distance functions. The Raita algorithm [10] utilizes the same approach as Horspool algorithm [6] to obtaining the shift value after an attempt. Instead of comparing each character in the pattern with the sliding window from right to left, the order of comparison in Raita algorithm [10] is carried out by first comparing the rightmost and leftmost characters of the pattern with the sliding window.…”

Section: Cpc-character Per Comparison Ratio (mentioning)

confidence: 99%
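The comparison order described in the statement above can be sketched as follows. This is a hedged illustration rather than Raita's published code: the function name `raita_search` is hypothetical, and the additional middle-character check reflects how the algorithm is commonly described, not something the quoted statement says. The shift table is the same one the Horspool algorithm uses.

```python
def raita_search(text: str, pattern: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # Same bad-character shift table as in the Horspool algorithm.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}

    first, middle, last = pattern[0], pattern[m // 2], pattern[-1]
    inner = pattern[1:m - 1]  # characters strictly between first and last

    matches = []
    pos = 0
    while pos <= n - m:
        window_last = text[pos + m - 1]
        # Cheap guard checks first: rightmost, then leftmost, then middle.
        # The remaining characters are compared only if all three agree.
        if (window_last == last
                and text[pos] == first
                and text[pos + m // 2] == middle
                and text[pos + 1:pos + m - 1] == inner):
            matches.append(pos)
        pos += shift.get(window_last, m)
    return matches
```

The guard characters let most non-matching windows be rejected after one to three comparisons, before the full comparison of the inner characters is attempted.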

Exact Multiple Pattern Matching Algorithm using DNA Sequence and Pattern Pair

Bhukya, Somayajulu

2011

IJCA


Exact string matching algorithms are essential components in DNA applications of the computational biology. Pattern matching is an important task of the pattern discovery process in today's world for finding the structural and functional behavior in proteins and genes. Although pattern matching is commonly used in computer science and information processing, it can be found in everyday tasks. Molecular biologists often search for the important information from the databases in different directions of different uses. With the increasing need for instant information, pattern matching will continue to grow and change as needed from time to time. In this research we propose a new pattern matching technique called an Exact multiple pattern matching algorithms using DNA sequence and pattern pair. The current approach is used to avoid unnecessary comparisons in the DNA sequence. Due to this, the number of comparisons gradually decreases and comparison per character ratio of the proposed algorithm reduces accordingly when compared to the some of the existing popular methods. Proposed algorithm is implemented and compared with existing algorithms. Comparison results demonstrate that index based algorithm is efficient than the number of the existing techniques.


“…String matching [2] algorithms are used in most of the real world applications where pattern extraction is required like as Intrusion Detection system [3][4], Plagiarism detection [5], Data Mining [6] and Bioinformatics [7]. Bit parallel algorithms are faster than the other benchmark character based algorithms like as KMP [4] [8], BM [9][10], BMH [11] [12], BMHS [13], BMHS2 [14], BMI [15], Improved BMHS [16], Commentz-Walter [17][18], Wu Manber [19] [20] and Aho-Corasick [21] [22] etc. Bit Parallel algorithms [23] are based on the non deterministic automata but there is no such automata are present.…”

Section: Introduction (mentioning)

confidence: 99%

Bit Parallel String Matching Algorithms: A Survey

Gupta, Rasool

2014

IJCA


The intrinsic parallelism in bit operations like AND/OR inside a computer word is known as bit parallelism. Since 1992, this bit parallelism is directly used in string matching for matching efficiency improvement. Some of the popular bit parallel string matching algorithms Shift OR, Shift OR with Q-Gram, BNDM, TNDM, SBNDM, LBNDM, FBNDM, BNDMq, and Multiple pattern BNDM. This paper discusses the working of various bit parallel string matching algorithms with example. Here we present how bit parallelism is useful for efficiency improvement in various algorithms.
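To make the notion of bit parallelism concrete, here is a minimal sketch of the Shift-And formulation, the complemented counterpart of the Shift OR algorithm named in the abstract. It is an illustration under stated assumptions, not code from the survey: the function name `shift_and_search` is hypothetical, and Python's arbitrary-precision integers stand in for a machine word, so the usual limit of pattern length to the word size does not apply here.

```python
def shift_and_search(text: str, pattern: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text."""
    m = len(pattern)
    if m == 0:
        return []

    # masks[ch] has bit i set exactly when pattern[i] == ch.
    masks: dict[str, int] = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    accept = 1 << (m - 1)  # bit that signals a complete match
    state = 0              # bit i set: pattern[:i + 1] matches ending here
    matches = []
    for end, ch in enumerate(text):
        # One shift, one OR, and one AND advance all prefix matches at once.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            matches.append(end - m + 1)
    return matches
```

Each text character costs a constant number of word operations regardless of how many partial matches are being tracked, which is where the efficiency gain comes from.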

