2004
DOI: 10.1093/bioinformatics/btg392

Comparative evaluation of word composition distances for the recognition of SCOP relationships

Abstract: All MATLAB code used to generate the data is available upon request to the authors. Additional material available at http://bioinformatics.musc.edu/wmetric

Cited by 41 publications (43 citation statements)
References 26 publications
“…The W-metric (Vinga et al, 2004) weighs differences between all pairs of amino acids by their entries in matrix W. Here, we use BLOSUM62 (Henikoff and Henikoff, 1992).…”
Section: Alignment-free Methods (mentioning)
confidence: 99%
“…Vinga et al (2004) found their results virtually the same for different matrices (BLOSUM62, BLOSUM50, BLOSUM40 and PAM250); we use BLOSUM62 (Henikoff and Henikoff 1992).…”
Section: Previous Work (mentioning)
confidence: 89%
“…Vinga et al (2004) introduced the W-metric which they categorise as “word-based” but we note that for amino acid sequences (to which it was applied originally), it effectively operates on 1-mers only:…”
Section: Previous Work (mentioning)
confidence: 99%
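
The citation statements above describe the W-metric as weighting differences between all pairs of amino acids by entries of a matrix W, and as operating on 1-mer (single-residue) frequencies. A minimal Python sketch of that idea follows, assuming the distance is the quadratic form sum over i, j of (fX_i - fY_i) * w_ij * (fX_j - fY_j) over residue frequencies; this form is an assumption here and should be checked against the original paper. The names `composition`, `w_metric`, and `identity_weights` are hypothetical, and the identity matrix is only a placeholder for a real substitution matrix such as BLOSUM62, which is not reproduced here.

```python
from collections import Counter

# The 20 standard amino acids in a fixed (alphabetical) order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq, alphabet=AMINO_ACIDS):
    """Return the relative frequency of each alphabet symbol (1-mer) in seq."""
    counts = Counter(seq.upper())
    # Symbols outside the alphabet are ignored when normalising.
    total = sum(counts[a] for a in alphabet)
    if total == 0:
        raise ValueError("sequence contains no symbols from the alphabet")
    return [counts[a] / total for a in alphabet]


def w_metric(seq_x, seq_y, weights, alphabet=AMINO_ACIDS):
    """Assumed form: sum_i sum_j (fx_i - fy_i) * w_ij * (fx_j - fy_j).

    `weights` is a square matrix (list of lists) indexed in alphabet order.
    """
    fx = composition(seq_x, alphabet)
    fy = composition(seq_y, alphabet)
    diff = [a - b for a, b in zip(fx, fy)]
    n = len(alphabet)
    return sum(
        diff[i] * weights[i][j] * diff[j]
        for i in range(n)
        for j in range(n)
    )


def identity_weights(n):
    """Placeholder weight matrix (NOT BLOSUM62): the n x n identity."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
```

With the identity placeholder, the quadratic form reduces to the squared Euclidean distance between the two composition vectors. Passing a substitution matrix in the same residue order as `AMINO_ACIDS` would give the weighted variant the citing papers describe.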
“…King and Guda [48] presented an n-gram-based Bayesian classifier that predicts the localization of a protein sequence. The classification accuracy of n-gram composition metrics was reviewed by Vinga and Almeida [8,49], together with a new definition of distance between protein sequences. Volkovich et al [50] applied n-grams to the classification of DNA sequences considered as text over the four-letter alphabet {A, C, G, T}.…”
Section: N-grams (mentioning)
confidence: 99%