FastUniq: A Fast De Novo Duplicates Removal Tool for Paired Short Reads

Xu, Haibin; Luo, Xiang; Qian, Jun; Pang, Xiaohui; Song, Jingyuan; Qian, Guang‐Rui; Chen, Jinhui; Chen, Shilin

doi:10.1371/journal.pone.0052249

Cited by 497 publications

(363 citation statements)

References 29 publications

Statement types: Supporting, Mentioning (361), Contrasting


“…Raw FASTQ read files were evaluated using FastQC (v0.11.2) and then trimmed using Trimmomatic (v0.32) 68 to remove adaptor read-through, low-quality bases, and ambiguous base calls. All jumping matepair DNA libraries were processed using the program FastUniq (v1.1) 69 to remove duplicate read pairs. The N. clavipes genome was assembled de novo using a meta-assembly approach.…”

Section: Methods (mentioning)

confidence: 99%

The Nephila clavipes genome highlights the diversity of spider silk genes and their complex expression

et al. 2017


More than 380 million years of evolution have produced >46,000 extant spider species, exhibiting an incredible diversity of silks used for prey capture and reproduction [1][2][3]. Spider silks can be stronger than steel and tougher than Kevlar, yet are much lighter weight than these manmade materials 4. Silks vary in extensibility 5, are temperature resilient 6, can enable electrical conduction 7, and can inhibit bacterial growth while being nearly invisible to the human immune system 8. Thus, novel materials derived from spider silks offer tremendous potential for medical and industrial innovation. To take advantage of their desirable properties, we must learn more about spider silk genetic structure, functional diversity, and production.

A female orb-weaving spider can have up to seven morphologically differentiated types of silk glands, each believed to extrude a distinct class of silk with biophysical characteristics resulting from the expression of a unique combination of silk genes in that gland 9,10. The silk classes of a typical 'gluey silk' orb-weaver (Araneoidea) female include (i) major ampullate silk, which exhibits great tensile strength and is employed in draglines, bridgelines, and web radii 11,12; (ii) minor ampullate silk, used for inelastic temporary spirals during web building 11,12; (iii) cement-like piriform silk that bonds fibers together and to other substrates 13,14; (iv) strong, yet flexible aciniform silk used for prey wrapping and egg case insulation 15; (v) tubuliform and cylindriform silk that constitutes the tough outer layer of egg cases 16,17; (vi) flagelliform silk that exhibits unparalleled extensibility and is used in the capture spiral 18,19; and (vii) the viscous and sticky aggregate silk that aids in prey capture [20][21][22][23][24]. Many spider species produce just a subset of these silk classes, and some produce yet other silk types, including cribellate silk 25. Each species possesses an assortment of specialized gland types that are thought to produce distinct classes of silks to fit specific needs 9,26,27.

Spider silks are composed primarily of spidroin proteins (where a 'spidroin' is a spider fibroin [28][29][30][31]) that, by convention, have been named and classified according to the specific silk gland in which they were first discovered. Spidroin proteins have conserved N- and C-terminal domains that flank long runs of repeated motifs 32-34, the composition and number of which confer specific physical properties to silks 27. Yet, despite decades of research on orb-weaver silks, there is incomplete knowledge of all the spidroins within an orb-weaver species.

Adding to the sampling of sequences obtained from targeted investigations, the assembly of the velvet spider (Stegodyphus mimosarum) genome yielded 19 spidroins, the largest collection from any single species 27. Owing to the challenges of assembling arrays of repeats, several of the S. mimosarum spidroin sequences are incomplete, without the sequences encoding N- and C-terminal domains anchored ...


“…For all PH207 genomic libraries, with the exception of TruSeq synthetic long-reads, PCR duplicates were removed using FastUniq software (Xu et al, 2012). The Illumina HiSeq 2000 adaptor AGATCGGAAGAGC was removed, and reads were error corrected using the Corrector_HA module of SOAPdenovo (using kmer size 23 and cutoff of 6).…”

Section: Read Preprocessing and Error Correction (mentioning)

confidence: 99%

Draft Assembly of Elite Inbred Line PH207 Provides Insights into Genomic and Transcriptome Diversity in Maize

et al. 2016


“…Prior to assembly, duplicate read pairs were removed from each dataset using FastUniq [39] and the order of the remaining unique reads randomized using fastqsort [36]. Only reads more than 31 bp were used for assembly, which corresponded to the k-mer size used for baiting.…”

Section: (E) Assembly Of Mitogenome Sequences (mentioning)

confidence: 99%

Tropical ancient DNA reveals relationships of the extinct Bahamian giant tortoise Chelonoidis alburyorum

et al. 2017


Ancient DNA of extinct species from the Pleistocene and Holocene has provided valuable evolutionary insights. However, these are largely restricted to mammals and high latitudes because DNA preservation in warm climates is typically poor. In the tropics and subtropics, non-avian reptiles constitute a significant part of the fauna and little is known about the genetics of the many extinct reptiles from tropical islands. We have reconstructed the near-complete mitochondrial genome of an extinct giant tortoise from the Bahamas (Chelonoidis alburyorum) using an approximately 1 000-year-old humerus from a water-filled sinkhole (blue hole) on Great Abaco Island. Phylogenetic and molecular clock analyses place this extinct species as closely related to Galápagos (C. niger complex) and Chaco tortoises (C. chilensis), and provide evidence for repeated overseas dispersal in this tortoise group. The ancestors of extant Chelonoidis species arrived in South America from Africa only after the opening of the Atlantic Ocean and dispersed from there to the Caribbean and the Galápagos Islands. Our results also suggest that the anoxic, thermally buffered environment of blue holes may enhance DNA preservation, and thus are opening a window for better understanding evolution and population history of extinct tropical species, which would likely still exist without human impact.
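The citation statements listed here all describe the same preprocessing step: removing duplicate read pairs before assembly, in one case followed by a minimum read length filter. The excerpts do not describe how FastUniq does this internally, so the following is only a minimal Python sketch of the general idea, assuming that a duplicate is a pair whose two mate sequences both match an earlier pair exactly. The `ReadPair` type, the function names, and the rule that both mates must exceed the length cutoff are hypothetical choices made for illustration.

```python
from typing import Iterable, Iterator, NamedTuple


class ReadPair(NamedTuple):
    """Hypothetical in-memory record for one paired-end read."""
    name: str
    seq1: str  # sequence of the first mate
    seq2: str  # sequence of the second mate


def remove_duplicate_pairs(pairs: Iterable[ReadPair]) -> Iterator[ReadPair]:
    """Yield the first occurrence of each (seq1, seq2) combination.

    Assumption: duplicates are exact matches on both mates. This is an
    illustration, not FastUniq's actual algorithm.
    """
    seen = set()
    for pair in pairs:
        key = (pair.seq1, pair.seq2)
        if key in seen:
            continue  # an identical pair was already emitted
        seen.add(key)
        yield pair


def longer_than(pairs: Iterable[ReadPair], min_len: int = 31) -> Iterator[ReadPair]:
    """Keep pairs in which both mates are longer than min_len bases.

    Assumption: the length cutoff is applied to both mates of a pair.
    """
    for pair in pairs:
        if len(pair.seq1) > min_len and len(pair.seq2) > min_len:
            yield pair


if __name__ == "__main__":
    reads = [
        ReadPair("r1", "ACGT" * 10, "TTGA" * 10),
        ReadPair("r2", "ACGT" * 10, "TTGA" * 10),  # exact duplicate of r1
        ReadPair("r3", "ACGT" * 10, "TTGC" * 10),  # second mate differs
        ReadPair("r4", "ACGT", "TTGA"),            # too short for the filter
    ]
    kept = list(longer_than(remove_duplicate_pairs(reads)))
    print([p.name for p in kept])
```

Keeping the first occurrence of each pair preserves input order, and a set of sequence tuples is the simplest way to test for exact matches; it holds every unique pair in memory, which may not be practical for very large libraries.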
