2015
DOI: 10.1101/015867
Preprint

BamHash: a checksum program for verifying the integrity of sequence data

Abstract: Summary: Large resequencing projects require a significant amount of storage for raw sequences, as well as alignment files. Since the raw sequences are redundant once the alignment has been generated, it is possible to keep only the alignment files. We present BamHash, a checksum-based method to ensure that the read pairs in FASTQ files match exactly the read pairs stored in BAM files, regardless of the ordering of reads. BamHash can be used to verify the integrity of the files stored and discover any discrepanc…
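
The abstract's key property, a checksum that does not depend on read order, can be illustrated with a minimal sketch. It assumes each read is hashed on its own and the per-read values are summed modulo 2^64; the function names, the choice of MD5, and the field separator are illustrative assumptions, not a description of BamHash's actual implementation.

```python
import hashlib

MASK64 = (1 << 64) - 1  # keep the running sum within 64 bits


def read_digest(name, seq, qual):
    """Hash one read (hypothetical helper): MD5 of its fields, truncated to 64 bits."""
    record = "\x00".join((name, seq, qual)).encode("ascii")
    return int.from_bytes(hashlib.md5(record).digest()[:8], "little")


def order_independent_checksum(reads):
    """Sum per-read digests modulo 2**64.

    Addition is commutative, so any ordering of the same reads
    yields the same (checksum, read count) pair.
    """
    total = 0
    count = 0
    for name, seq, qual in reads:
        total = (total + read_digest(name, seq, qual)) & MASK64
        count += 1
    return total, count


if __name__ == "__main__":
    # Made-up reads standing in for FASTQ records.
    fastq_order = [
        ("read1/1", "ACGT", "IIII"),
        ("read1/2", "TTGA", "IIIH"),
        ("read2/1", "GGCA", "HHII"),
    ]
    bam_order = list(reversed(fastq_order))  # same reads, different order
    print(order_independent_checksum(fastq_order) == order_independent_checksum(bam_order))
```

Comparing the checksum computed over the FASTQ reads with the one computed over the reads extracted from the BAM file then indicates whether the two files hold the same set of reads, under the assumptions stated above.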

Cited by 1 publication
(1 citation statement)
References 4 publications
“…In the existing literature, several works specialize in the challenges imposed by the above use cases but provide only partial solutions by either addressing privacy [12], [32], [40], [41], [72], [73], or integrity [22], [37], [83], [86]. The handful of works addressing both challenges require significant modifications to existing hardware or software infrastructures.…”
Section: Introduction (mentioning)
confidence: 99%