2016
DOI: 10.1186/s12859-016-1130-6

RefSelect: a reference sequence selection algorithm for planted (l, d) motif search

Abstract: Background: The planted (l, d) motif search (PMS) is an important yet challenging problem in computational biology. Pattern-driven PMS algorithms usually use k out of t input sequences as reference sequences to generate candidate motifs, and they can find all the (l, d) motifs in the input sequences. However, most of them simply take the first k sequences in the input as reference sequences without elaborate selection processes, and thus they may exhibit sharp fluctuations in running time, especially for large a…
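To make the problem statement concrete, here is a minimal brute-force sketch of a pattern-driven search that uses a single reference sequence (k = 1). It is not RefSelect or any of the published PMS algorithms; the function names, the `ref_index` parameter, and the DNA alphabet are assumptions made for illustration. A string is treated as an (l, d) motif if every input sequence contains some l-mer within Hamming distance d of it.

```python
from itertools import combinations, product


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(candidate: str, sequence: str, d: int) -> bool:
    """True if some l-mer of `sequence` is within distance d of `candidate`."""
    l = len(candidate)
    return any(
        hamming(candidate, sequence[i:i + l]) <= d
        for i in range(len(sequence) - l + 1)
    )


def is_ld_motif(candidate: str, sequences: list[str], d: int) -> bool:
    """True if `candidate` occurs, with at most d mismatches, in every sequence."""
    return all(occurs_within(candidate, s, d) for s in sequences)


def neighbors(lmer: str, d: int, alphabet: str = "ACGT") -> set[str]:
    """All strings within Hamming distance d of `lmer` (assumed DNA alphabet)."""
    out = {lmer}
    for k in range(1, d + 1):
        for positions in combinations(range(len(lmer)), k):
            for subs in product(alphabet, repeat=k):
                chars = list(lmer)
                for p, c in zip(positions, subs):
                    chars[p] = c
                out.add("".join(chars))
    return out


def find_motifs(sequences: list[str], l: int, d: int, ref_index: int = 0) -> list[str]:
    """Generate candidates from one reference sequence, then verify each one.

    `ref_index` is a hypothetical parameter: it picks which input sequence
    serves as the reference. The result is the same for any choice, but the
    number of candidates to verify can differ.
    """
    ref = sequences[ref_index]
    candidates: set[str] = set()
    for i in range(len(ref) - l + 1):
        candidates |= neighbors(ref[i:i + l], d)
    return sorted(c for c in candidates if is_ld_motif(c, sequences, d))


if __name__ == "__main__":
    seqs = ["ACGTAC", "TTCGTA", "GGAGTA"]
    print(find_motifs(seqs, l=3, d=1))
```

Because every motif must lie within distance d of some l-mer of every sequence, any input sequence is a valid reference; only the size of the candidate set changes, which is why the choice of reference affects running time but not the output.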

Cited by 7 publications (2 citation statements)
References 29 publications
“…The current PMP problem algorithms are classed into two groups such as the exact and approximation algorithms. Exact PMP algorithms, which are widely used include qPMS series [4], RefSelect [5], PairMOtif series [6], RecMotif [7], and iTriplet [8]. All kinds of exact algorithms require substantial computational effort if ℓ and d are quite high.…”
Section: Introduction
mentioning
confidence: 99%
“…$x^{B}_{i}$ is a binary descriptor vector for the $i$th training molecule, $w^{B}$ and $w^{NB}$ are weight vectors corresponding to $x^{B}$ and $x^{NB}$, respectively, $L$ is the number of a training molecules, and $k^{B}(\cdot)$ is a binary kernel function. Rather than using a single kernel or linear regression, MultiDK utilizes multiple kernels such as a nonlinear binary kernel for binary descriptors and linear processing for non-binary descriptors separately. To optimize a kernel function [70][71][72], multiple combinatorial kernels have been used in various applications including biomedical data [73] and YouTube video data [74,75]. Here, we use a multiple kernel approach to apply appropriate kernels for different features instead of training the kernel.…”
mentioning
confidence: 99%