“…
• BBB: a binary branch-and-bound method [8] of complexity O((mn + log σ)σ) in the worst case and O((mn + log log σ) log σ) in the best case,
• CDP: a classical dynamic programming repeated for all t ∈ [−σ, σ], of complexity O(mnσ) [10],
• KBB: a k-ary branch-and-bound method [8] of complexity O((mn + log(σk/(k − 1)))σk/(k − 1)) in the worst case and O((mn + log(k log_k σ))k log_k σ) in the best case (we used k = 3 in our experiments, following [8], where the authors found this value best in their experiments; the choice of k was also confirmed in our preliminary experiments),
• SDP: a sparse dynamic programming method [11] of complexity O(mn log m),
• YBP: a bit-parallel algorithm [2] of complexity O(mn⌈σ/w⌉), where w is the machine word size (in bits),
• HBP: a bit-parallel LCS algorithm [6] repeated for all possible t values, of complexity O(⌈n/w⌉mσ),
• NGMD: a recent algorithm [14] of complexity O(mn log log σ).
…”
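The simplest baseline in the list, CDP, can be sketched directly: run the classical O(mn) LCS dynamic program once per transposition t ∈ [−σ, σ], for O(mnσ) total. This is a minimal illustration, not the papers' implementations; the function names are ours.

```python
def lcs_length(a, b):
    # Classical O(mn) dynamic-programming LCS length, two rows at a time.
    n = len(b)
    prev = [0] * (n + 1)
    for x in a:
        cur = [0] * (n + 1)
        for j in range(1, n + 1):
            if x == b[j - 1]:
                cur[j] = prev[j - 1] + 1
            else:
                cur[j] = max(prev[j], cur[j - 1])
        prev = cur
    return prev[n]

def cdp_transposition_invariant_lcs(a, b, sigma):
    # CDP baseline: repeat the classical DP for every transposition
    # t in [-sigma, sigma], shifting sequence a by t each time.
    # Total cost O(m * n * sigma).
    best = 0
    for t in range(-sigma, sigma + 1):
        shifted = [x + t for x in a]
        best = max(best, lcs_length(shifted, b))
    return best

# Example over alphabet {0..4}: b is a copy of a shifted by t = 1,
# so the transposition-invariant LCS is the full length 4.
print(cdp_transposition_invariant_lcs([0, 2, 1, 3], [1, 3, 2, 4], sigma=4))  # 4
```

The faster entries in the list (SDP, the branch-and-bound methods, the bit-parallel variants) all improve on exactly this loop, either by pruning transpositions or by packing DP cells into machine words.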
“…In particular, both points we mentioned are part of the MLCS in the top box of Figure 1. In contrast, the match (2, 9, 4, 8), corresponding to letter 'B', does not dominate the match (4, 2, 1, 1), because in the first coordinate we have 2 < 4. These two matches are incomparable and therefore cannot both occur in an MLCS.…”
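The dominance test described above is a coordinate-wise comparison; a minimal sketch (function names are ours, following the convention in the excerpt that a match dominates another when every coordinate is strictly greater):

```python
def dominates(p, q):
    # Match point p dominates match point q when every coordinate of p
    # is strictly greater than the corresponding coordinate of q,
    # i.e. q can precede p on one common subsequence.
    return all(pi > qi for pi, qi in zip(p, q))

def compatible(p, q):
    # Two matches can both lie on a common subsequence only if one
    # dominates the other; otherwise they are incomparable.
    return dominates(p, q) or dominates(q, p)

# The example from the excerpt: (2, 9, 4, 8) fails to dominate
# (4, 2, 1, 1) because 2 < 4 in the first coordinate.
print(dominates((2, 9, 4, 8), (4, 2, 1, 1)))   # False
print(compatible((2, 9, 4, 8), (4, 2, 1, 1)))  # False: incomparable
```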
We consider the problem of maintaining information about multiple longest common subsequences. Such subsequences highlight information shared across several sequences and are therefore used extensively in bioinformatics and computational genomics. In this paper we propose a way to maintain this information when the underlying sequences are subject to modification, namely when letters are added to and removed from the ends of a sequence. Experimentally, our data structure obtains significant improvements over the state of the art.
“…Since the amount of malware is increasing, we need a faster algorithm for finding an LCS. Crochemore et al. [10] have proposed a bit-vector algorithm with a processing time of O(MN/w), where w is the number of bits in a machine word. The method assigns one bit to each cell in the DP matrix and calculates w cells in bulk using four operations (and, or, not, and add).…”
Section: A LCS Problem and Bit-vector Algorithm
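The four-operation bit-vector update can be sketched in the style of Crochemore et al., using Python's unbounded integers in place of machine words (so the ⌈N/w⌉-word arithmetic is handled implicitly). This is an illustrative sketch, not the paper's code; variable names are ours.

```python
def bit_vector_lcs(a, b):
    # Bit-parallel LCS length: one bit per column of the DP row.
    n = len(b)
    mask = (1 << n) - 1
    # Match vectors: bit j of match[c] is set iff b[j] == c.
    match = {}
    for j, c in enumerate(b):
        match[c] = match.get(c, 0) | (1 << j)
    v = mask  # row vector, all ones initially
    for c in a:
        m = match.get(c, 0)
        u = v & m
        # One row update with and, or, not, and add.
        v = ((v + u) | (v & ~m)) & mask
    # Each zero bit of v marks one matched position of b.
    return n - bin(v).count("1")

# Classic example pair with LCS length 4 (e.g. "BCBA").
print(bit_vector_lcs("ABCBDAB", "BDCABA"))  # 4
```

With true w-bit words this processes w DP cells per operation, which is where the O(MN/w) bound comes from.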
We propose two novel techniques for reducing the workload of malware analysis. The first technique is restricted instructions, which accelerates finding the longest common subsequence (LCS) between machine-code instruction sequences of malware. The second technique is probabilistic disassembly, which can find the most probable disassembly of a binary stream without clues such as debug symbols or import-function information. By combining the two proposals with our generic unpacker, we built an automatic malware classification system. Given an unknown malware program, the system enables analysts to find the most similar known malware program and even to estimate differing and common instructions. In one of our experiments, we classified 3,233 malware samples in the wild and found that 75% of the samples belong to the seven largest clusters. As a result, only seven samples, one from each cluster, needed to be analyzed to reveal the functionality of that 75%, a substantial increase in the efficiency of analysis.