2002
DOI: 10.1074/mcp.m200032-mcp200

Abundance and Distributions of Eukaryote Protein Simple Sequences

Abstract: Protein simple sequences are a subclass of low complexity regions of sequence that are highly enriched in one or a few residue types. Such sequences are common in transcription regulatory proteins, in structural proteins, in proteins involved in nucleic acid interactions, and in mediating protein-protein interactions. Simple sequences of 10 or more residues containing >50% of a single residue type are surveyed in this work. Both eukaryote and prokaryote proteomes are investigated with emphasis on the eukaryot…
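The abstract's working definition (10 or more residues, more than 50% of a single residue type) can be illustrated with a small sketch. This is not the authors' survey procedure, which the truncated abstract does not describe. It is a minimal illustration that assumes a fixed-length sliding window of 10 residues, and the names `dominant_residue`, `is_simple`, and `simple_windows` are hypothetical helpers introduced here.

```python
from collections import Counter

MIN_LENGTH = 10      # "10 or more residues" in the abstract
MIN_FRACTION = 0.5   # ">50% of a single residue type" in the abstract


def dominant_residue(segment):
    """Return (residue, fraction) for the most frequent residue in segment."""
    residue, count = Counter(segment).most_common(1)[0]
    return residue, count / len(segment)


def is_simple(segment):
    """True if the segment meets the abstract's length and composition cutoffs."""
    if len(segment) < MIN_LENGTH:
        return False
    _, fraction = dominant_residue(segment)
    return fraction > MIN_FRACTION


def simple_windows(sequence, window=MIN_LENGTH):
    """Yield (start, residue) for each fixed-length window that qualifies.

    Assumption: a fixed window is a simplification; the abstract allows
    any length of 10 or more.
    """
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        if is_simple(segment):
            yield start, dominant_residue(segment)[0]
```

For example, `list(simple_windows("MKTSSSSSSSSAGLV"))` reports every 10-residue window in which serine exceeds half the positions. Overlapping hits are left unmerged on purpose, because the union of several qualifying windows does not necessarily stay above the 50% cutoff.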

Cited by 45 publications (51 citation statements)
References 40 publications
“…Serine-rich regions are present in most CAP-Gly proteins and have been proposed to be linkers between the different domains of the protein, sites of regulation by phosphorylation, or additional tubulin binding domains (28,34,41). In the CLIP family of proteins, these regions are poorly conserved at the primary sequence level and have characteristics of unstructured sequences (34).…”
Section: Discussion
mentioning
confidence: 99%
“…4, c-f). Such simple sequences are common in both eukaryotic and prokaryotic genomes (113), and it will be interesting to determine how they affect protein homeostasis. It has been shown that simple sequences play an important role in the regulation of two eukaryotic transcription factors by leading to their partial degradation by the proteasome (40).…”
Section: Conserved Protein Unfolding Mechanism Among ATP-dependent
mentioning
confidence: 99%
“…Thus, there are about 320,000 low-complexity sequences that cannot be accurately compared or aligned and therefore cannot be compared on any large scale, either functionally or evolutionarily. In addition, there are low-complexity segments in half of all proteins (32). These segments also cannot be reliably aligned and so are currently "masked" by SEG or similar procedures and then ignored by the alignment tools (29,34).…”
mentioning
confidence: 99%
“…These proteins are rich in a few amino acids and thus have overall composition significantly different from the "average" compositions seen in the multiple alignments used to construct the BLOSUM alignment scoring matrices and for the BLAST statistical analyses (16). About 10% of known protein sequences have overall low complexity; eukaryotic genomes and some bacterial pathogens contain even higher percentages of low-complexity sequences (24,32). The NCBI nonredundant database currently contains approximately 3.2 million sequences.…”
mentioning
confidence: 99%