We report here on the genome sequence of Pasteurella multocida Razi 0002 of avian origin, isolated in Iran. The genome has a size of 2,289,036 bp, a GC content of 40.3%, and is predicted to contain 2,079 coding sequences.
Large-scale population-based analyses coupled with advances in technology have demonstrated that the human genome is more diverse than originally thought. To date, this diversity has largely been uncovered using short-read whole-genome sequencing. However, standard short-read approaches, used primarily due to accuracy, throughput and costs, fail to give a complete picture of a genome. They struggle to identify large, balanced structural events, cannot access repetitive regions of the genome and fail to resolve the human genome into its two haplotypes. Here we describe an approach that retains long-range information while harnessing the advantages of short reads. Starting from only ~1 ng of DNA, we produce barcoded short-read libraries. The use of novel informatic approaches allows for the barcoded short reads to be associated with the long molecules of origin, producing a novel datatype known as 'Linked-Reads'. This approach allows for simultaneous detection of small and large variants from a single Linked-Read library. We have previously demonstrated the utility of whole-genome Linked-Reads (lrWGS) for performing diploid, de novo assembly of individual genomes (Weisenfeld et al.). In this manuscript, we show the advantages of Linked-Reads over standard short-read approaches for reference-based analysis. We demonstrate the ability of Linked-Reads to reconstruct megabase-scale haplotypes and to recover parts of the genome that are typically inaccessible to short reads, including phenotypically important genes such as STRC, SMN1 and SMN2. We demonstrate the ability of both lrWGS and Linked-Read Whole Exome Sequencing (lrWES) to identify complex structural variations, including balanced events, single exon deletions, and single exon duplications. The data presented here show that Linked-Reads provide a scalable approach for comprehensive genome analysis that is not possible using short reads alone.
Large-scale population analyses coupled with advances in technology have demonstrated that the human genome is more diverse than originally thought. To date, this diversity has largely been uncovered using short-read whole-genome sequencing. However, these short-read approaches fail to give a complete picture of a genome. They struggle to identify structural events, cannot access repetitive regions, and fail to resolve the human genome into haplotypes. Here, we describe an approach that retains long range information while maintaining the advantages of short reads. Starting from ∼1 ng of high molecular weight DNA, we produce barcoded short-read libraries. Novel informatic approaches allow for the barcoded short reads to be associated with their original long molecules producing a novel data type known as "Linked-Reads". This approach allows for simultaneous detection of small and large variants from a single library. In this manuscript, we show the advantages of Linked-Reads over standard short-read approaches for reference-based analysis. Linked-Reads allow mapping to 38 Mb of sequence not accessible to short reads, adding sequence in 423 difficult-to-sequence genes including disease-relevant genes STRC, SMN1, and SMN2. Both Linked-Read whole-genome and whole-exome sequencing identify complex structural variations, including balanced events and single exon deletions and duplications. Further, Linked-Reads extend the region of high-confidence calls by 68.9 Mb. The data presented here show that Linked-Reads provide a scalable approach for comprehensive genome analysis that is not possible using short reads alone.
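The core idea of associating barcoded short reads with their original long molecules can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes reads have already been aligned and reduced to a (barcode, chromosome, position) triple, and it uses a hypothetical `max_gap` threshold of 50 kb to decide when two reads with the same barcode belong to different input molecules. The names `Read` and `infer_molecules` are invented for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical minimal record for one aligned, barcoded short read."""
    barcode: str
    chrom: str
    pos: int


def infer_molecules(reads, max_gap=50_000):
    """Cluster reads that share a barcode into inferred long molecules.

    Reads with the same barcode on the same chromosome are sorted by
    position; a new molecule starts whenever the distance to the previous
    read exceeds max_gap (an assumed threshold, not a published value).
    Returns tuples of (barcode, chrom, start, end, read_count).
    """
    by_key = defaultdict(list)
    for read in reads:
        by_key[(read.barcode, read.chrom)].append(read.pos)

    molecules = []
    for (barcode, chrom), positions in sorted(by_key.items()):
        positions.sort()
        start = prev = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - prev > max_gap:
                # Gap too large: close the current molecule, open a new one.
                molecules.append((barcode, chrom, start, prev, count))
                start, count = pos, 0
            prev = pos
            count += 1
        molecules.append((barcode, chrom, start, prev, count))
    return molecules


if __name__ == "__main__":
    example = [
        Read("AAAC", "chr1", 1_000),
        Read("AAAC", "chr1", 21_000),
        Read("AAAC", "chr1", 60_000),
        Read("AAAC", "chr1", 900_000),  # far away: a second molecule
        Read("GGTT", "chr1", 5_000),
    ]
    for molecule in infer_molecules(example):
        print(molecule)
```

Grouping by barcode first and splitting on distance second reflects the assumption that a partition holds only a few input molecules, so reads that share a barcode and lie close together most likely came from the same fragment.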
Traditional 2nd generation sequencing strategies have significantly reduced the cost of sequencing the human genome and provide flexibility to query specific gene panels, the whole exome, or the whole genome. However, these methodologies are based on short reads, which limit their ability to phase/haplotype over long genomic distances, accurately map reads between highly homologous regions (e.g., genes vs. pseudogenes), and robustly detect particular types of structural variants (e.g., inversions and translocations). Advances in microfluidics technology and precision reagent delivery allow long-range information to be rescued and preserved through the use of the 10x Genomics Chromium platform. Each input DNA fragment (~40-200 kb) is partitioned into a gel-bead in emulsion (GEM), and subsequent biochemistry generates mini-libraries of NGS-ready molecules tagged with a barcode unique for each GEM. Thus, long-range context is achieved by linking short reads sharing the same barcode, and contiguity is established because they were derived from the same input fragment. Importantly, the barcoded mini-libraries are compatible with short-read sequencers and can be implemented as an add-on to existing sequencing infrastructures. Here we describe and demonstrate how the user-friendly and uniquely tuned liquid handling capabilities of the PerkinElmer Sciclone® NGSx Workstation interface with the 10x Genomics chip to successfully automate the Chromium Genome workflow. We show the preservation of intact genomic DNA during automated library preparation and demonstrate that these libraries have comparable quality to those generated by manual preparation. Following Chromium partitioning, mini-library generation, and pooling of all the GEM mini-libraries, samples were processed again on the Sciclone using a previously established automated workflow for exome/panel target capture using Agilent SureSelect™ baits.
This end-to-end automated workflow was used to generate Linked-Read whole exome data on samples with unresolved structural rearrangements and targeted Linked Reads in a Lynch syndrome gene, PMS2. Linked Reads enable us to 1) fine-map structural rearrangements detected by karyotyping and 2) resolve variants in PMS2 versus those in its homologous pseudogene, PMS2CL, without invoking non-NGS methods such as MLPA or long-range PCR. The benefits of automation are essential to the scale-up of high-throughput projects by removing manual variability and increasing efficiency. This partnership offers a unique workflow solution that enables exome and panel-based Linked-Read sequencing at scale. For Research Use Only. Not for use in diagnostic procedures. Citation Format: Renata Pellegrino, Michael Benway, Paulina Kocjan, Andrew Price, Charlly Kao, Brian A. Gerwe, Adrian Fehr, Fernanda Mafra, James Garifallou, Hakon Hakonarson. High-throughput automation of the 10x Genomics® Chromium™ workflow for linked-read whole exome sequencing and a targeted lynch syndrome panel [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2017; 2017 Apr 1-5; Washington, DC. Philadelphia (PA): AACR; Cancer Res 2017;77(13 Suppl):Abstract nr 5353. doi:10.1158/1538-7445.AM2017-5353