Massively parallel DNA sequencing offers many benefits, but major inhibitory cost factors include: (1) start-up (i.e., purchasing initial reagents and equipment); (2) buy-in (i.e., getting the smallest possible amount of data from a run); and (3) sample preparation. Reducing sample preparation costs is commonly addressed, but start-up and buy-in costs are rarely addressed. We present dual-indexing systems to address all three of these issues. By breaking the library construction process into universal, re-usable, combinatorial components, we reduce all costs, while increasing the number of samples and the variety of library types that can be combined within runs. We accomplish this by extending the Illumina TruSeq dual-indexing approach to 768 (384 + 384) indexed primers that produce 384 unique dual-indexes or 147,456 (384 × 384) unique combinations. We maintain eight nucleotide indexes, with many that are compatible with Illumina index sequences. We synthesized these indexing primers, purifying them with only standard desalting and placing small aliquots in replicate plates. In qPCR validation tests, 206 of 208 primers tested passed (99% success). We then created hundreds of libraries in various scenarios. Our approach reduces start-up and per-sample costs by requiring only one universal adapter that works with indexed PCR primers to uniquely identify samples. Our approach reduces buy-in costs because: (1) relatively few oligonucleotides are needed to produce a large number of indexed libraries; and (2) the large number of possible primers allows researchers to use unique primer sets for different projects, which facilitates pooling of samples during sequencing. Our libraries make use of standard Illumina sequencing primers and index sequence length and are demultiplexed with standard Illumina software, thereby minimizing customization headaches. In subsequent Adapterama papers, we use these same primers with different adapter stubs to construct amplicon and restriction-site associated DNA libraries, but their use can be expanded to any type of library sequenced on Illumina platforms.
Molecular ecologists seek to genotype hundreds to thousands of loci from hundreds to thousands of individuals at minimal cost per sample. Current methods, such as restriction-site-associated DNA sequencing (RADseq) and sequence capture, are constrained by costs associated with inefficient use of sequencing data and sample preparation. Here, we introduce RADcap, an approach that combines the major benefits of RADseq (low cost with specific start positions) with those of sequence capture (repeatable sequencing of specific loci) to significantly increase efficiency and reduce costs relative to current approaches. RADcap uses a new version of dual-digest RADseq (3RAD) to identify candidate SNP loci for capture bait design and subsequently uses custom sequence capture baits to consistently enrich candidate SNP loci across many individuals. We combined this approach with a new library preparation method for identifying and removing PCR duplicates from 3RAD libraries, which allows researchers to process RADseq data using traditional pipelines, and we tested the RADcap method by genotyping sets of 96-384 Wisteria plants. Our results demonstrate that our RADcap method: (i) methodologically reduces (to<5%) and allows computational removal of PCR duplicate reads from data, (ii) achieves 80-90% reads on target in 11 of 12 enrichments, (iii) returns consistent coverage (≥4×) across >90% of individuals at up to 99.8% of the targeted loci, (iv) produces consistently high occupancy matrices of genotypes across hundreds of individuals and (v) costs significantly less than current approaches.
47Next-generation DNA sequencing (NGS) offers many benefits, but major factors limiting NGS 48 include reducing the time and costs associated with: 1) start-up (i.e., doing NGS for the first 49 time), 2) buy-in (i.e., getting any data from a run), and 3) sample preparation. Although many 50 researchers have focused on reducing sample preparation costs, few have addressed the first two 51 problems. Here, we present iTru and iNext, dual-indexing systems for Illumina libraries that 52 help address all three of these issues. By breaking the library construction process into re-usable, 53 combinatorial components, we achieve low start-up, buy-in, and per-sample costs, while 54 simultaneously increasing the number of samples that can be combined within a single run. We 55 accomplish this by extending the Illumina TruSeq dual-indexing approach from 20 (8+12) 56 indexed adapters that produce 96 (8x12) unique combinations to 579 (192+387) indexed primers 57 that produce 74,304 (192x387) unique combinations. We synthesized 208 of these indexed 58All rights reserved. No reuse allowed without permission.(which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.The copyright holder for this preprint . http://dx.doi.org/10.1101/049114 doi: bioRxiv preprint first posted online Jun. 15, 2016; 3 primers for validation, and 206 of them passed our validation criteria (99% success). We also 59 used the indexed primers to create hundreds of libraries in a variety of scenarios. Our approach 60 reduces start-up and per-sample costs by requiring only one universal adapter which works with 61 indexed PCR primers to uniquely identify samples. Our approach reduces buy-in costs because: 62 1) relatively few oligonucleotides are needed to produce a large number of indexed libraries; and 632) the large number of possible primers allows researchers to use unique primer sets for different 64 projects, which facilitates pooling of samples during sequencing. Although the methods we 65
Molecular ecologists frequently use genome reduction strategies that rely upon restriction enzyme digestion of genomic DNA to sample consistent portions of the genome from many individuals (e.g., RADseq, GBS). However, researchers often find the existing methods expensive to initiate and/or difficult to implement consistently, especially because it is difficult to multiplex sufficient numbers of samples to fill entire sequencing lanes. Here, we introduce a low-cost and highly robust approach for the construction of dual-digest RADseq libraries that build on adapters and primers designed in Adapterama I. Major features of our method include: (1) minimizing the number of processing steps; (2) focusing on a single strand of sample DNA for library construction, allowing the use of a non-phosphorylated adapter on one end; (3) ligating adapters in the presence of active restriction enzymes, thereby reducing chimeras; (4) including an optional third restriction enzyme to cut apart adapter-dimers formed by the phosphorylated adapter, thus increasing the efficiency of adapter ligation to sample DNA, which is particularly effective when only low quantity/quality DNA samples are available; (5) interchangeable adapter designs; (6) incorporating variable-length internal indexes within the adapters to increase the scope of sample indexing, facilitate pooling, and increase sequence diversity; (7) maintaining compatibility with universal dual-indexed primers and thus, Illumina sequencing reagents and libraries; and, (8) easy modification for the identification of PCR duplicates. We present eight adapter designs that work with 72 restriction enzyme combinations. We demonstrate the efficiency of our approach by comparing it with existing methods, and we validate its utility through the discovery of many variable loci in a variety of non-model organisms. Our 2RAD/3RAD method is easy to perform, has low startup costs, has increased utility with low-concentration input DNA, and produces libraries that can be highly-multiplexed and pooled with other Illumina libraries.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.