Abstract.The known 2-string LCS problem is generalized to finding a Longest Common Subsequence (LCS) for a set of strings. A new, general approach that systematically enumerates common subsequences is proposed for the solution. Assuming a finite symbol set, it is shown that the presented scheme requires a preprocessing time that grows linearly with the total length of the input strings and a processing time that grows linearly with (K), the number of strings, and (IIPII) the number of matches among them. The only previous algorithm for the generalized LCS problem takes O(K). ISll'lS21'...'[Sgl) execution time, where IS,I denotes the length of the string Si. Since typically IIPII is a very small percentage of tSlt'lS21"...'lSxl, the proposed method may be considered to be much more elScient than the straightforward dynamic programming approach.
An approach to designing very fast algorithms for approximate string matching in a dictionary is proposed. Multiple spelling errors corresponding to insert, delete, change, and transpose operations on character strings are considered in the fault model. The design of very fast approximate string matching algorithms through a four-step reduction procedure is described. The final and most effective step uses hashing techniques to avoid comparing the given word with words at large distances. The technique has been applied to a library book catalog textbase. The experiments show that performing approximate string matching for a large dictionary in real-time on an ordinary sequential computer under our multiple fault model is feasible.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.