2018
DOI: 10.1093/nar/gky1130

The SUPERFAMILY 2.0 database: a significant proteome update and a new webserver

Abstract: Here, we present a major update to the SUPERFAMILY database and the webserver. We describe the addition of a new SUPERFAMILY 2.0 profile HMM library containing a total of 27 623 HMMs. The database now includes Superfamily domain annotations for millions of protein sequences taken from the Universal Protein Resource Knowledgebase (UniProtKB) and the National Center for Biotechnology Information (NCBI). This addition constitutes about 51 and 45 million distinct protein sequences obtained from UniProtKB and NCBI re…

Cited by 158 publications (119 citation statements)
References 24 publications

“…Step 1 Downloading LPIs, lncRNA sequences, and protein sequences from NPInter (Hao et al, 2016), NONCODE (Zhao et al, 2015), and SUPERFAMILY (Pandurangan et al, 2018), respectively.…”
Section: Sfpel-lpi
Citation type: mentioning
confidence: 99%
“…Database analysis on PlasmoDB (Aurrecoechea et al, 2009) revealed PF3D7_0925900 as the only gene in the genome of P. falciparum 3D7 containing a calycin and lipocalin superfamily signature (Mitchell et al, 2019; Pandurangan et al, 2019). The encoded protein of 217 amino acids length, which we term P. falciparum lipocalin (PfLCN), has an unknown function.…”
Section: Identification and Evolutionary Analysis Of Lipocalin-like P
Citation type: mentioning
confidence: 99%
“…In contrast, low pairwise identities of 23% to 27% were observed for lipocalin-like proteins in Toxoplasma gondii (GenBank CEL71535), Neospora caninum (GenBank XP_003879719) and Hammondia hammondi (GenBank KEP62611), and blastp searches did not yield any significant sequence alignments to species outside the apicomplexan lineage, where overall pairwise sequence identities to known lipocalin proteins dropped below 20%, making it difficult to detect more distant homologous proteins at the sequence level. Therefore, we relied upon annotations based on libraries of protein signatures provided by the SUPERFAMILY and other databases integrated in InterPro to identify additional putative lipocalins (Mitchell et al, 2019; Pandurangan et al, 2019). Interestingly, when we did a comparative analysis of the genomes of other apicomplexan parasites and their closest free-living non-parasitic relatives for the presence of proteins belonging to the lipocalin (SSF50814 SUPERFAMILY identifier) and calycin (IPR012674, InterPro identifier) homologous superfamily, we found that most parasitic species encode one to three putative lipocalin-like genes only, while in most of the free living relatives a substantially higher number of putative lipocalin-like genes can be found even when different genome sizes are taken into account (Figure S2).…”
Section: Identification and Evolutionary Analysis Of Lipocalin-like P
Citation type: mentioning
confidence: 99%