2015
DOI: 10.1093/nar/gkv1271

PSORTdb: expanding the bacteria and archaea protein subcellular localization database to better reflect diversity in cell envelope structures

Abstract: Protein subcellular localization (SCL) is important for understanding protein function, genome annotation, and has practical applications such as identification of potential vaccine components or diagnostic/drug targets. PSORTdb (http://db.psort.org) comprises manually curated SCLs for proteins which have been experimentally verified (ePSORTdb), as well as pre-computed SCL predictions for deduced proteomes from bacterial and archaeal complete genomes available from NCBI (cPSORTdb). We now report PSORTdb 3.0. I…

Cited by 82 publications (57 citation statements)
References 30 publications
“…Using the PSORT algorithm, 31 putative outer membrane proteins were predicted to be encoded in the genome, which is similar to the average number predicted for other Gram-negatives (34,8 according to PSORTdb30). LC-MS analysis of the P. limnophila proteome membrane fraction confirmed 77% of the predicted proteins (24 in total, Supplementary Table 2).…”
Section: Results (supporting)
confidence: 55%
“…We downloaded from the database PSORTb version 3.00 (64), which contains predicted subcellular localization for bacterial and archeal genomes, the full database tables for gram-positive and gram-negative bacteria. Based on RefSeq accession numbers, localization information could be assigned to 1 442 202 protein sequences out of 5 125 116 in our database.…”
Section: Methods (mentioning)
confidence: 99%
“…There are some reported methods for extracting negative datasets, such as: (1) Negative datasets are constructed by using random pairs which exclude the experimentally detected interactions [1], and as there are discordant numbers between high-confidence interactions and random pairs, the scale and structure of networks should be balanced between negative and positive datasets. This method may include undetected PPIs; (2) Negative examples are chosen based on the categories of their distinct functions, such as sub-cellular localization (can be accessed by tools such as LOCATE [38], PSORTdb 3.0 [39], LocDB [40]) and annotations (such as KEGG pathways, gene ontology (GO), and Enzyme Commission (EC)) [22,41]. However, these methods can also lead to biases due to varying definitions of categories [42]; (3) Another alternative approach is based on topological policy: choose pairs of separated proteins in existing PPI networks to represent non-interactions: defining negative samples as the protein pairs with the shortest path lengths exceed the median shortest paths in a GSP network [43], or further construct a GSN network based on the principle of keeping the composition and degree of a node identical to the GSP network [20].…”
Section: Defining Gold Standard Datasets (mentioning)
confidence: 99%