2004
DOI: 10.1093/nar/gkh121

The Pfam protein families database

Abstract: Pfam is a large collection of protein families and domains. Over the past 2 years the number of families in Pfam has doubled and now stands at 6190 (version 10.0). Methodology improvements for searching the Pfam collection locally as well as via the web are described. Other recent innovations include modelling of discontinuous domains allowing Pfam domain definitions to be closer to those found in structure databases. Pfam is available on the web in the UK (http://www.sanger.ac.uk/Software/Pfam/), the USA (htt…

Cited by 3,192 publications
(1,500 citation statements)
“…The conserved domains of inserts were analysed using the CD search module in NCBI (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi). Their structure and putative function were annotated based on similarities to the sequences in the Clusters of Orthologous Groups (COG) [21], Protein Families (Pfam) [22] (http://xfam.org/), and Blocks [23] (InterPro http://www.ebi.ac.uk/interpro/) databases and based on the results of BLAST searches in UniProt (http://www.uniprot.org/). If the similarity of the protein sequence alignment was less than 30%, it was considered to be an unknown sequence.…”
Section: Methods (mentioning)
confidence: 99%
“…Community driven databases such as GNPS (Wang et al, 2016) provide a good platform for researchers to contribute to the growth of spectral library knowledge, much like how nucleotide or protein databases were established in recent years (Bateman et al, 2004; Sayers et al, 2012).…”
Section: Tandem Mass Spectrometry and Spectral Networking (mentioning)
confidence: 99%
“…Single-copy gene analysis was performed to infer biogeographical patterns by (1) selecting 47 conserved single-copy gene families in isolate genomes in the Integrated Microbial Genomes (IMG) database (Markowitz et al 2006) using PFAM (Bateman et al 2004) profile searches with rps-BLAST (Altschul et al 1997), (2) identifying members of these families in the bacterial sludge metagenomes, (3) aligning each family with ClustalX (Thompson et al 1994), and (4) generating neighbor-joining trees using ClustalX. See Supplemental Research Data for details.…”
Section: Bioinformatic Analyses (mentioning)
confidence: 99%