2002
DOI: 10.1002/prot.10078

Structural genomics analysis: Characteristics of atypical, common, and horizontally transferred folds

Abstract: We conducted a structural genomics analysis of the folds and structural superfamilies in the first 20 completely sequenced genomes by focusing on the patterns of fold usage and trying to identify structural characteristics of typical and atypical folds. We assigned folds to sequences using PSI-blast, run with a systematic protocol to reduce the amount of computational overhead. On average, folds could be assigned to about a fourth of the ORFs in the genomes and about a fifth of the amino acids in the proteomes…

Cited by 35 publications (32 citation statements)
References 91 publications (109 reference statements)
“…Out of 138,377 entries, an average of 38.4 ± 1.5 (SE) % (range 9.7%-48.6%) matched SCOP domains, and 19.5 ± 0.8 % (range 9.5%-28.3%) had enzymatic activities associated with them. For comparison purposes, we also used a data set that matched 420 fold categories in SCOP 1.39 and was generated by PSI-BLAST comparisons between PDB and genome sequence entries in the first 20 genomes ever to be sequenced (Hegyi et al 2002). We considered soluble proteins that grouped into major structural classes: all-α proteins with structures composed mostly of α-helices (α), all-β proteins with mostly β-sheets (β), α/β proteins with interspersed α-helices and β-sheets (α/β), α+β proteins containing segregated α-helices and β-sheet regions (α+β), multidomain proteins containing domains belonging to different classes and without known homologs (M), and small proteins (S).…”
Section: Methods (mentioning)
confidence: 99%
“…Uniform distribution patterns are suggestive of common ancestry and long-term architectural stability, and are spread by vertical descent (Hegyi et al 2002). Uneven distribution patterns are suggestive of horizontal gene transfer (HGT), gene loss, convergence, and rapid divergence (Eisen 2000).…”
Section: Distribution Of Protein Folds Across Domains (mentioning)
confidence: 99%
“…These threading assignment results are quite similar to that of the PDB benchmark, with a slightly larger portion of targets assigned to the easy set in E. coli, which may be due to the fact that homologues are not excluded. It should also be mentioned that genome scale structure predictions have been performed by many authors on different organisms (21-27). Most are based on homology modeling or sequence comparison techniques, which require solved homologous structures.…”
Section: Fig (mentioning)
confidence: 99%
“…Protein folds are among the most conserved components in nature, making them good candidates for the study of distant evolutionary relationships. Folds were surveyed in a number of genomes (Gerstein and Levitt 1997; Gerstein 1997, 1998; Frishman and Mewes 1997; Wolf et al 1999; Hegyi et al 2002) and indexed in several databases (Lee et al 2003). Fold composition, measured as presence-absence of individual folds, was used to reconstruct whole-genome trees based on the idea that closely related organisms must share significantly more fold architectures than distantly related ones (Gerstein 1998; Wolf et al 1999; Lin and Gerstein 2000).…”
Section: Introduction (mentioning)
confidence: 99%