2020
DOI: 10.1093/nar/gkaa913

Pfam: The protein families database in 2021

Abstract: The Pfam database is a widely used resource for classifying protein sequences into families and domains. Since Pfam was last described in this journal, over 350 new families have been added in Pfam 33.1 and numerous improvements have been made to existing entries. To facilitate research on COVID-19, we have revised the Pfam entries that cover the SARS-CoV-2 proteome, and built new entries for regions that were not covered by Pfam. We have reintroduced Pfam-B which provides an automatically generated supplement…

Citations: cited by 4,289 publications (3,396 citation statements)
References: 27 publications
“…Domain annotations for one representative bacterial CSP and human CSD-containing proteins according to SMART [ 27 ]. For CSDE1, Pfam [ 31 ] agrees with the domain annotation shown here. Uniprot [ 22 ] annotates two additional CSDs in CSDE1, one between CSD3 and CSD4 and one between CSD4 and CSD5, as well as two additional truncated CSDs, one between CSD1 and CSD2 and one between CSD2 and CSD3.…”
Section: Figure
Citation type: supporting
confidence: 70%
“…Domains were predicted using the same Hmmsearch procedure against the Pfam database (version 33.0) 49 . SIGNALP (version 5.0) was run to predict the putative cellular localization of the proteins using the parameters -org arch in archaeal genomes and -org gram+ in bacterial genomes 50 .…”
Section: Functional Annotation
Citation type: mentioning
confidence: 99%
“…66,67 We employed Kullback-Leibler (KL) sequence conservation score KLConsScore using MSA profiles generated by hidden Markov models in Pfam database for the SARS-CoV S glycoproteins. 68,69 Three Pfam domains were utilized corresponding to S1, the NTD (bCoV_S1_N, Betacoronavirus-like spike glycoprotein S1, N-terminal, Pfam:PF16451, Uniprot SPIKE_CVHSA, pdb id 6CS0, residues 33-324), the RBD The KL conservation is calculated according to the following formula:…”
Section: Sequence Conservation and Coevolutionary Analyses
Citation type: mentioning
confidence: 99%
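
The last citation statement refers to a Kullback-Leibler conservation score, but its formula is cut off in the excerpt. As a rough illustration only, the sketch below computes the textbook relative entropy of one alignment column against a background distribution. The function name `kl_conservation`, the uniform background, and the pseudocount are assumptions made here for illustration; they are not taken from the citing paper or from Pfam, and the citing authors' KLConsScore may differ.

```python
import math
from collections import Counter

# The 20 standard amino acids; gaps and other symbols are ignored below.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kl_conservation(column, background=None, pseudocount=1e-6):
    """Relative entropy (in nats) of one alignment column vs. a background.

    Hypothetical helper: `column` is a string holding the residues found at
    one position of a multiple sequence alignment.
    """
    if background is None:
        # Assumption: uniform background frequencies. A real analysis would
        # more likely use empirical amino acid frequencies.
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}

    # Keep only residues that have a background frequency (drops gaps).
    residues = [r for r in column.upper() if r in background]
    if not residues:
        return 0.0

    counts = Counter(residues)
    # The pseudocount keeps every probability strictly positive.
    total = len(residues) + pseudocount * len(background)

    score = 0.0
    for aa, q in background.items():
        p = (counts.get(aa, 0) + pseudocount) / total
        score += p * math.log(p / q)
    return score
```

Under these assumptions a fully conserved column scores close to ln 20, and a column containing each amino acid once scores close to zero, so higher values indicate stronger conservation.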