2017
DOI: 10.1186/s12859-017-1560-9

String kernels for protein sequence comparisons: improved fold recognition

Abstract: Background: The amino acid sequence of a protein is the blueprint from which its structure and ultimately function can be derived. Therefore, sequence comparison methods remain essential for the determination of similarity between proteins. Traditional approaches for comparing two protein sequences begin with strings of letters (amino acids) that represent the sequences, before generating textual alignments between these strings and providing scores for each alignment. When the similitude between the two protein…

Citations: cited by 4 publications (13 citation statements)
References: 61 publications
“…The weighted string kernel considered here, referred to as WSeqKernel, is inspired by the convolution string kernels introduced by D. Haussler [36], the local alignment kernel presented by Saigo et al. [27], and the string kernel of Smale and co-workers [28]. An unweighted version was presented in details in Nojoomi and Koehl [29]. We provide here the key elements of its construction, emphasizing the differences with those kernels.…”
Section: Methods (mentioning)
confidence: 99%
“… is the sequence kernel considered in this paper. Following [28, 29, 36], we make the following remarks: The input kernel matrix G is not a traditional substitution matrix, as it does not involve applying the logarithm function on the probability measures. While the latter is needed to make scores additive, a necessary condition to enable the use of dynamic programming algorithms to generate pairwise sequence alignment, it is not needed for the string kernel we use here.…”
Section: Methods (mentioning)
confidence: 99%
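The remark quoted above can be illustrated with a minimal sketch of a generic convolution-style string kernel, in which residue similarities are multiplied within a k-mer and summed over all k-mer pairs, so no logarithm (and no dynamic programming) is involved. This is not the WSeqKernel definition from the citing paper: the three-letter alphabet, the values in `G`, and the function names `g`, `kmer_kernel`, `string_kernel`, and `normalized_kernel` are all assumptions made for illustration.

```python
# Hypothetical toy residue kernel over a three-letter alphabet.
# The values are illustrative only and are not taken from any paper.
G = {
    ("A", "A"): 1.0, ("A", "C"): 0.2, ("A", "D"): 0.1,
    ("C", "C"): 1.0, ("C", "D"): 0.3,
    ("D", "D"): 1.0,
}


def g(a, b):
    """Symmetric lookup in the toy residue kernel."""
    return G[(a, b)] if (a, b) in G else G[(b, a)]


def kmer_kernel(u, v):
    """Similarity of two equal-length k-mers: product of residue similarities."""
    result = 1.0
    for a, b in zip(u, v):
        result *= g(a, b)
    return result


def string_kernel(s, t, k):
    """Sum the k-mer kernel over every pair of length-k substrings of s and t."""
    total = 0.0
    for i in range(len(s) - k + 1):
        for j in range(len(t) - k + 1):
            total += kmer_kernel(s[i:i + k], t[j:j + k])
    return total


def normalized_kernel(s, t, k):
    """Cosine-style normalization so that a sequence compared with itself scores 1."""
    return string_kernel(s, t, k) / (string_kernel(s, s, k) * string_kernel(t, t, k)) ** 0.5
```

For example, `string_kernel("AC", "AD", 2)` compares the single 2-mer pair (`AC`, `AD`) and multiplies the two residue similarities directly. A log-odds substitution matrix would instead add per-position scores, which is the additivity that alignment by dynamic programming relies on.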
“…They are, however, computationally intensive to evaluate. With current databases of biological sequences at the order of hundreds of gigabytes, alternatives have been proposed both as faster, heuristic algorithms and as easier to compute similarity measures [32,33,3,25,11].…”
Section: Introduction (mentioning)
confidence: 99%