2013
DOI: 10.1007/978-1-62703-447-0_4

Managing Large SNP Datasets with SNPpy

Abstract: Using relational databases to manage SNP datasets is a very useful technique that has significant advantages over alternative methods, including the ability to leverage the power of relational databases to perform data validation, and the use of the powerful SQL query language to export data. SNPpy is a Python program which uses the PostgreSQL database and the SQLAlchemy Python library to automate SNP data management. This chapter shows how to use SNPpy to store and manage large datasets.

Cited by 2 publications (3 citation statements)
References 3 publications
“…However, there are a few other software packages available that could possibly also be used for the intended purpose of TheSNPpit. SNPpy was already published in 2011 with an update in 2013 [5, 15], while dbVOR is much more recent from 2015 [6]. The design of the packages is quite different from TheSNPpit.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…We have compared dbVOR to SNPpy, a recently developed database system [6, 7]. Both SNPpy and dbVOR are database systems that store and retrieve genotype information from experiments.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…While several database systems have been developed for managing genetic data [1-7], when we tried some of these, we found that some relied on commercial database systems that were so complicated that they required a database administrator to routinely maintain and apply regular security updates. Others did not scale well as the numbers of markers genotyped per experiment rapidly increased.…”
Section: Introduction (citation type: mentioning)
confidence: 99%