2018 Data Compression Conference 2018
DOI: 10.1109/dcc.2018.00025

Lossy Compression of Quality Scores in Differential Gene Expression: A First Assessment and Impact Analysis

Abstract: High-throughput sequencing of RNA molecules has enabled the quantitative analysis of gene expression at the expense of storage space and processing power. To alleviate these problems, lossy compression methods for the quality scores associated with RNA sequencing data have recently been proposed, and the evaluation of their impact on downstream analyses is gaining attention. In this context, this work presents a first assessment of the impact of lossily compressed quality scores in RNA sequencing data on the perf…

Cited by 2 publications (2 citation statements). References 30 publications.

“…It has been shown and confirmed at length that lossy approaches for quality score representation provide significant storage saving with negligible impact on variant calling (Hsi-Yang Fritz et al, 2011; Hach et al, 2012; Jones et al, 2012; Greenfield et al, 2016; Ochoa et al, 2016; Roguski et al, 2018), regardless of the idiosyncrasies of the lossy compression approach. The effect of lossy compression of quality scores has also been explored in differential gene expression with similar conclusions on the negligible effect of applying lossy representations (Hernandez-Lopez et al, 2018). Furthermore, recent advances in sequencing technologies are leading the production of longer genomic sequences with better accuracy and drastically reduced resolution for the quality scores (Illumina, 2017), supporting the claim that coarser representations are in principle suitable for omic analyses.…”
Section: Introduction
mentioning
confidence: 82%
“…domains (Alberti et al, 2016; Ochoa et al, 2016; Hernandez-Lopez et al, 2018) without clear consensus on the limits of safe lossy distortion levels. Meanwhile, the increasing complexity of genomic assays, data sets, and computational methods only adds to the difficulty of potential quantification.…”
mentioning
confidence: 99%