2020
DOI: 10.1093/database/baz158

A negative storage model for precise but compact storage of genetic variation data

Abstract: Falling sequencing costs and large initiatives are resulting in increasing amounts of data available for investigator use. However, there are informatics challenges in being able to access genomic data. Performance and storage are well-appreciated issues, but precision is critical for meaningful analysis and interpretation of genomic data. There is an inherent accuracy vs. performance trade-off with existing solutions. The most common approach (Variant-only Storage Model, VOSM) stores only variant data. System…

Citations: cited by 4 publications (2 citation statements)
References: 13 publications
“…A base was considered sufficiently covered if the depth of coverage was ≥ 14 in tumor sample and ≥ 8 in normal samples as has been previously described: https://www.synapse.org/#!Synapse:syn1695394. The fraction of each gene’s protein coding bases (using the longest RefSeq transcript) covered by sufficient sequence data was calculated for each sample using the Negative Storage Model [13]. Gene-level depth of coverage was then determined by calculating the number of bases sufficiently covered by sequencing for each of the RefSeq coding genes (with 25 base-pair flanking regions).…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…A base was considered sufficiently covered if the depth of coverage was ≥ 14 in tumor sample and ≥ 8 in normal samples (as has been previously described: https://www.synapse.org/#!Synapse:syn1695394, accessed on 11 March 2016). The fraction of each gene's protein coding bases (using the longest RefSeq transcript) covered by sufficient sequence data was calculated for each sample using the Negative Storage Model [22].…”
Section: Gene List Acquisition
Citation type: mentioning
confidence: 99%
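
The coverage rule quoted in the citation statements (depth ≥ 14 in tumor, ≥ 8 in normal) can be illustrated with a minimal Python sketch. It is not the citing authors' pipeline and does not use the Negative Storage Model itself. The function name `covered_fraction`, the constant names, and the sample depth values are hypothetical, and the sketch assumes that a base counts as sufficiently covered only when both the tumor and the matched normal sample meet their thresholds, which the quoted text does not state explicitly.

```python
from typing import Sequence

TUMOR_MIN_DEPTH = 14   # threshold quoted in the citation statements
NORMAL_MIN_DEPTH = 8   # threshold quoted in the citation statements


def covered_fraction(tumor_depth: Sequence[int], normal_depth: Sequence[int]) -> float:
    """Fraction of positions where both samples meet their depth threshold.

    Assumption: the two sequences hold per-base depths over the same
    coding positions of one gene, in the same order.
    """
    if len(tumor_depth) != len(normal_depth):
        raise ValueError("depth vectors must have the same length")
    if len(tumor_depth) == 0:
        return 0.0
    covered = sum(
        1
        for t, n in zip(tumor_depth, normal_depth)
        if t >= TUMOR_MIN_DEPTH and n >= NORMAL_MIN_DEPTH
    )
    return covered / len(tumor_depth)


# Hypothetical per-base depths over a five-base coding region.
tumor = [20, 14, 13, 30, 0]
normal = [8, 7, 10, 12, 9]
print(covered_fraction(tumor, normal))  # 2 of 5 positions pass -> 0.4
```

Only the first and fourth positions pass both thresholds here; the second fails on normal depth and the third on tumor depth, so a base with high depth in one sample alone does not count under this assumption.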