1997
DOI: 10.1002/pro.5560060319

Embedding strategies for effective use of information from multiple sequence alignments

Abstract: We describe a new strategy for utilizing multiple sequence alignment information to detect distant relationships in searches of sequence databases. A single sequence representing a protein family is enriched by replacing conserved regions with position-specific scoring matrices (PSSMs) or consensus residues derived from multiple alignments of family members. In comprehensive tests of these and other family representations, PSSM-embedded queries produced the best results overall when used with a special version…
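The consensus-embedding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `consensus` and `embed_consensus`, the `(start, rows)` block layout, and the simple per-column majority vote are all assumptions made for illustration. The abstract only says that consensus residues are derived from multiple alignments of family members and does not specify how.

```python
from collections import Counter


def consensus(block_rows):
    """Return the most common residue in each column of an ungapped block.

    Assumption: a plain majority vote stands in for whatever
    consensus rule the original method uses.
    """
    if len({len(row) for row in block_rows}) != 1:
        raise ValueError("all rows of a block must have the same length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*block_rows)
    )


def embed_consensus(sequence, blocks):
    """Replace each conserved region of `sequence` with its block consensus.

    `blocks` is a hypothetical layout: a list of (start, rows) pairs, where
    `start` is the 0-based offset of the conserved region in `sequence` and
    `rows` are the aligned segments from the family members.
    """
    residues = list(sequence)
    for start, rows in blocks:
        cons = consensus(rows)
        if start < 0 or start + len(cons) > len(residues):
            raise ValueError("block does not fit inside the sequence")
        # Overwrite the conserved region in place; flanking residues are kept.
        residues[start:start + len(cons)] = cons
    return "".join(residues)


if __name__ == "__main__":
    family_block = ["TAYI", "TAYL", "SAYI"]  # toy aligned segments
    print(embed_consensus("MKSAYLAKQR", [(2, family_block)]))
```

Only the conserved region changes; the rest of the representative sequence is left untouched, so the result can still be used as an ordinary single-sequence query. Embedding a PSSM instead of consensus residues would need a search program that accepts position-specific scores, which this sketch does not cover.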

Cited by 67 publications (50 citation statements)
References 43 publications
“…The Cobbler consensus sequence (Henikoff and Henikoff 1997) of the Motif blocks identified were used to search the GenBank database (Henikoff and Henikoff 1994). To generate a tree, all sequences identified (those in Fig.…”
Section: Phylogenetic Analysis
mentioning
confidence: 99%
“…A pairwise amino acid similarity matrix s(i, j) [e.g., BLOSUM, PAM (105,106)] is often used to assess amino acid matching. Given two sequences to be aligned, the global similarity between the two protein sequences is scored as follows.…”
Section: The Significant Segment Pair Alignment (SSPA) Protocol (14)
mentioning
confidence: 99%
“…Among these, sequence-based methods (2) to recognize homologs are well developed, but sensitivity falters as sequence similarity sinks into the ‘twilight zone,’ a threshold near 30% sequence identity (3). Sensitivity can be extended by using information from multiple aligned sequence families (4,5), local multiple alignment of blocks (6-9), and structure-based fold recognition such as threading (ref. 10 and references therein) and profiles (11).…”
mentioning
confidence: 99%