2012
DOI: 10.1371/journal.pone.0052249

FastUniq: A Fast De Novo Duplicates Removal Tool for Paired Short Reads

Abstract: The presence of duplicates introduced by PCR amplification is a major issue in paired short reads from next-generation sequencing platforms. These duplicates might have a serious impact on research applications, such as scaffolding in whole-genome sequencing and discovering large-scale genome variations, and are usually removed. We present FastUniq as a fast de novo tool for removal of duplicates in paired short reads. FastUniq identifies duplicates by comparing sequences between read pairs and does not requir…
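
The idea of identifying duplicates by comparing sequences between read pairs, with no reference alignment, can be illustrated with a minimal Python sketch. This is not FastUniq's implementation: the function name `remove_duplicate_pairs`, the in-memory set, and the exact-match rule are assumptions made for illustration only, and the sample sequences are made up.

```python
from typing import Iterable, Iterator, Tuple

# A read pair is represented here as (sequence of read 1, sequence of read 2).
ReadPair = Tuple[str, str]


def remove_duplicate_pairs(pairs: Iterable[ReadPair]) -> Iterator[ReadPair]:
    """Yield each distinct read pair once, keeping the first occurrence.

    Two pairs count as duplicates only if BOTH mates are identical
    (an assumed exact-match rule; no reference genome is consulted).
    """
    seen = set()
    for read1, read2 in pairs:
        key = (read1, read2)
        if key not in seen:
            seen.add(key)
            yield key


# Made-up example data: the second pair duplicates the first.
pairs = [
    ("ACGT", "TTGA"),
    ("ACGT", "TTGA"),
    ("ACGT", "TTGC"),  # same read 1, different read 2: kept
    ("GGCA", "TTGA"),
]
unique = list(remove_duplicate_pairs(pairs))
```

Comparing both mates together is the key design point: a pair that shares only one mate with another pair is not treated as a duplicate.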

Cited by 497 publications (363 citation statements)
References 29 publications
“…Raw FASTQ read files were evaluated using FastQC (v0.11.2) and then trimmed using Trimmomatic (v0.32) 68 to remove adaptor read-through, low-quality bases, and ambiguous base calls. All jumping matepair DNA libraries were processed using the program FastUniq (v1.1) 69 to remove duplicate read pairs. The N. clavipes genome was assembled de novo using a meta-assembly approach.…”
Section: Methods | Citation type: mentioning
confidence: 99%
“…For all PH207 genomic libraries, with the exception of TruSeq synthetic long-reads, PCR duplicates were removed using FastUniq software (Xu et al, 2012). The Illumina HiSeq 2000 adaptor AGATCGGAAGAGC was removed, and reads were error corrected using the Corrector_HA module of SOAPdenovo (using kmer size 23 and cutoff of 6).…”
Section: Read Preprocessing and Error Correction | Citation type: mentioning
confidence: 99%
“…Prior to assembly, duplicate read pairs were removed from each dataset using FastUniq [39] and the order of the remaining unique reads randomized using fastqsort [36]. Only reads more than 31 bp were used for assembly, which corresponded to the k-mer size used for baiting.…”
Section: (E) Assembly Of Mitogenome Sequences | Citation type: mentioning
confidence: 99%