2020
DOI: 10.1093/bioinformatics/btaa983

pdm_utils: a SEA-PHAGES MySQL phage database management toolkit

Abstract: Summary: Bacteriophages (phages) are incredibly abundant and genetically diverse. The volume of phage genomics data is rapidly increasing, driven in part by the SEA-PHAGES program, which isolates, sequences, and manually annotates hundreds of phage genomes each year. With an ever-expanding genomics dataset, there are many opportunities for generating new biological insights through comparative genomic and bioinformatic analyses. As a result, there is a growing need to be able to store, update,…

citations
Cited by 12 publications
(7 citation statements)
references
References 20 publications
“…MySQL database management system is characterized by small storage volume, fast query speed, and low development cost [23]. Database and its application system are the core of mobile intelligent medical system [24].…”
Section: Methods (mentioning)
Confidence: 99%
“…MMseqs2 and Clustal Omega must be installed separately, using a package manager, such as Anaconda or compiled manually; PhaMMseqs installation and usage instructions are available in the GitHub repository. PhaMMseqs has also been incorporated into the pdm_utils package (Mavrich et al 2021) for creating and maintaining phage genome databases, replacing its “phamerate” pipeline (i.e. the prior system for pham assembly).…”
Section: Methods (mentioning)
Confidence: 99%
“…Phage gene sequences are first translated into amino acid sequences, and PhaMMseqs uses MMseqs2 to first derive sequence profiles, followed by profile-sequence clustering to merge phams containing more remote homologs; Clustal Omega (Sievers et al 2011) is used to construct pham multiple sequence alignments (MSAs). PhaMMseqs can readily assemble Phams from large genome datasets (>500,000 genes) on modest hardware and can be used in combination with database management utilities like pdm_utils (Mavrich et al 2021) for efficiently updating phams as the genome annotation landscape changes.…”
Section: Introduction (mentioning)
Confidence: 99%
“…The PhagesDB database [21] is also linked to a second database used by the Phamerator package [23], which provides several key functionalities, including genome comparisons. A software package ‘pdm_utils’ provides the tools for coordinating data between these and GenBank, and for extracting data [24].…”
Section: Advances In Understanding Phage Diversity (mentioning)
Confidence: 99%