2013
DOI: 10.1142/s0129054113500159

EMS1: An Elegant Algorithm for Edit Distance Based Motif Search

Abstract: Motifs are biologically significant patterns found in DNA/protein sequences. Given a set of biological sequences, the problem of identifying the motifs is very challenging. This problem has been well studied in computational biology. Identifying motifs through experimental processes is extremely expensive and time consuming. This is one of the factors influencing computational biologists to come up with novel computational methods to predict motifs. Several motif models have been proposed in the literature and…

Cited by 7 publications (19 citation statements)
References 5 publications
“…For every (l, d) combination we report the average time taken by 4 runs. We compare the following four implementations: EMS1: A modified implementation of the algorithm in [13] which considered the neighborhood of only l-mers whereas the modified version considers the neighborhood of all k-mers where l − d ≤ k ≤ l + d. EMS2: A faster implementation of our sequential algorithm which uses tries for storing candidate motifs where each node of the trie stores an array of pointers to each children of the node. However, this makes the space required to store a tree node dependent on the size of the alphabet Σ.…”
Section: Results (mentioning)
confidence: 99%
“…Using equations (1), (2), (3) and (4), Pathak et al [13] gave an algorithm that stores all possible candidate motifs in an array of size |Σ|^l. However the algorithm is inefficient in generating the neighborhood as the same candidate motif is generated by several combinations of the basic edit operations.…”
Section: Methods (mentioning)
confidence: 99%
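
The redundancy described in the statement above can be seen on a toy input. The following is a minimal sketch, not the algorithm of [13]: it applies single substitutions, deletions, and insertions naively and counts how often each candidate string is produced. The function names and the DNA alphabet "ACGT" are assumptions made for illustration.

```python
from collections import Counter


def one_edit_neighbors(s, alphabet="ACGT"):
    """Yield every string produced from s by one basic edit operation.

    Duplicates are yielded on purpose so that the caller can count them.
    """
    for i in range(len(s)):
        # Deletion of the symbol at position i.
        yield s[:i] + s[i + 1:]
        # Substitution of the symbol at position i by a different symbol.
        for c in alphabet:
            if c != s[i]:
                yield s[:i] + c + s[i + 1:]
    for i in range(len(s) + 1):
        # Insertion of symbol c before position i (i == len(s) appends).
        for c in alphabet:
            yield s[:i] + c + s[i:]


def neighborhood_counts(s, d, alphabet="ACGT"):
    """Count how often each string is generated in d rounds of single edits."""
    counts = Counter()
    frontier = [s]
    for _ in range(d):
        next_frontier = []
        for t in frontier:
            for u in one_edit_neighbors(t, alphabet):
                counts[u] += 1
                next_frontier.append(u)
        frontier = next_frontier
    return counts


if __name__ == "__main__":
    counts = neighborhood_counts("AAC", 1)
    print("strings generated:", sum(counts.values()))
    print("distinct strings: ", len(counts))
```

Even at d = 1, deleting either "A" of "AAC" gives the same string "AC", so a naive generator does more work than the number of distinct candidates requires, and the gap grows with d.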
“…Following a useful practice for PMS algorithms, Pathak et al [13] evaluated their algorithm on certain instances that are considered challenging for PMS: (9,2), (11,3), (13,4) and so on [1], and are generated as follows: n = 20 random DNA/protein strings of length m = 600, and a short random string M of length l are generated according to the independent identically distributed (i.i.d) model. A separate random d hamming distance neighbor of M is "planted" in each of the n input strings.…”
Section: Introduction (mentioning)
confidence: 99%
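
The instance-generation procedure quoted above can be sketched as follows. This is an illustrative reading of the statement, not code from any of the cited papers. It assumes that each planted neighbor differs from M in exactly d positions, that the neighbor overwrites l characters of the background string so the length stays m, and that the alphabet is DNA. The names `hamming_neighbor` and `planted_instance` are hypothetical.

```python
import random


def hamming_neighbor(motif, d, alphabet, rng):
    """Return a copy of motif with exactly d positions changed (assumption)."""
    chars = list(motif)
    for i in rng.sample(range(len(chars)), d):
        chars[i] = rng.choice([c for c in alphabet if c != chars[i]])
    return "".join(chars)


def planted_instance(l, d, n=20, m=600, alphabet="ACGT", seed=0):
    """Generate n i.i.d. random strings of length m with a planted motif.

    Returns the motif M, the n strings, and the planting position in each.
    """
    rng = random.Random(seed)
    motif = "".join(rng.choice(alphabet) for _ in range(l))
    sequences, positions = [], []
    for _ in range(n):
        background = [rng.choice(alphabet) for _ in range(m)]
        # A separate random neighbor of M is drawn for each string.
        variant = hamming_neighbor(motif, d, alphabet, rng)
        pos = rng.randrange(m - l + 1)
        # Overwrite l characters so the string keeps length m (assumption).
        background[pos:pos + l] = variant
        sequences.append("".join(background))
        positions.append(pos)
    return motif, sequences, positions


if __name__ == "__main__":
    motif, sequences, positions = planted_instance(l=9, d=2)
    print(motif, positions[:3])
```

Fixing the seed makes an instance reproducible across runs, which helps when averaging running times over several instances of the same (l, d) pair.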