2017
DOI: 10.1371/journal.pone.0175310

SnoVault and encodeD: A novel object-based storage system and applications to ENCODE metadata

Abstract: The Encyclopedia of DNA elements (ENCODE) project is an ongoing collaborative effort to create a comprehensive catalog of functional elements initiated shortly after the completion of the Human Genome Project. The current database exceeds 6500 experiments across more than 450 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the H. sapiens and M. musculus genomes. All ENCODE experimental data, metadata, and associa…

Cited by 13 publications (7 citation statements)
References 34 publications
“…The design of the 4DN data portal infrastructure (Fig. 5), originally based on the ENCODE infrastructure, includes the following components: (1) A postgres database storing metadata in json format, first developed in ENCODE; (2) The python pyramid framework for the database known as SnoVault [52], first developed in ENCODE but further tailored and developed by the 4DN DCIC (https://github.com/4dn-dcic/snovault); (3) The FourFront front-end (https://github.com/4dn-dcic/fourfront), originally based on EncodeD [52] from ENCODE, but engineered by 4DN DCIC to feature a data model for representing diverse datasets, and includes a modern front-end with reactJS to provide a responsive user experience; (4) Elasticsearch that provides fast and efficient search with various metadata fields by indexing all items and formatting them for retrieval; (5) AWS S3 used for file storage, enabling all public data files to be accessed via the data portal interface; (6) A RESTful API underlying the infrastructure, through which all metadata in the portal can be accessed.…”
Section: Methods (mentioning)
confidence: 99%
“…Each NGS run included 6 pooled libraries loaded into one NextSeq High Output cartridge (300 Cycles; Illumina). Paired-end sequencing was performed on a NextSeq 500 system (Illumina) with 152 cycles (76 bp PE sequencing) following Encode Project protocol for best RNA Seq data (12). Individual microdissected areas inside each sample are unique to that section and therefore not repeatable through biological replicates.…”
Section: Methods (mentioning)
confidence: 99%
“…New data types, pipelines, metadata properties and ways of interacting with the data are continually being devised and improved upon. The Portal has seen many revisions since its launch in 2013 with new software releases occurring on average every three weeks (14). The release version can be found in the lower left corner of the Portal.…”
Section: The Portal (mentioning)
confidence: 99%