Background: The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly. Results: In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies. Conclusions: Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another.
A finished clone-based assembly of the mouse genome reveals extensive sequence duplication during recent evolution and rodent-specific expansion of certain gene families. Newly assembled duplications contain protein-coding genes that are mostly involved in reproductive function.
Single molecule approaches offer the promise of large, exquisitely miniature ensembles for the generation of equally large data sets. Although microfluidic devices have previously been designed to manipulate single DNA molecules, many of the functionalities they embody are not applicable to very large DNA molecules, normally extracted from cells. Importantly, such microfluidic devices must work within an integrated system to enable high-throughput biological or biochemical analysis, a key measure of any device aimed at the chemical/biological interface and required if large data sets are to be created for subsequent analysis. The challenge here was to design an integrated microfluidic device to control the deposition or elongation of large DNA molecules (up to millimeters in length), which would serve as a general platform for biological/biochemical analysis to function within an integrated system that included massively parallel data collection and analysis. The approach we took was to use replica molding to construct silastic devices to consistently deposit oriented, elongated DNA molecules onto charged surfaces, creating massive single molecule arrays, which we analyzed for both physical and biochemical insights within an integrated environment that created large data sets. The overall efficacy of this approach was demonstrated by the restriction enzyme mapping and identification of single human genomic DNA molecules.
Recombination rate in Drosophila species shapes the impact of selection in the genome and is positively correlated with nucleotide diversity.