Over the past 5 years, large-scale sequencing has been revolutionized by the development of several so-called next-generation sequencing (NGS) technologies. These have drastically increased the number of bases obtained per sequencing run while at the same time decreasing the costs per base. Compared to Sanger sequencing, NGS technologies yield shorter read lengths; however, despite this drawback, they have greatly facilitated genome sequencing, first for prokaryotic genomes and within the last year also for eukaryotic ones. This advance was possible due to a concomitant development of software that allows the de novo assembly of draft genomes from large numbers of short reads. In addition, NGS can be used for metagenomics studies as well as for the detection of sequence variations within individual genomes, e.g., single-nucleotide polymorphisms (SNPs), insertions/deletions (indels), or structural variants. Furthermore, NGS technologies have quickly been adopted for other high-throughput studies that were previously performed mostly by hybridization-based methods like microarrays. This includes the use of NGS for transcriptomics (RNA-seq) or the genome-wide analysis of DNA/protein interactions (ChIP-seq). This review provides an overview of NGS technologies that are currently available and the bioinformatics analyses that are necessary to obtain information from the flood of sequencing data as well as applications of NGS to address biological questions in eukaryotic microorganisms.
The first report on the sequence of 10 consecutive bases in a DNA strand was published in 1968 (117), but methods to reliably obtain longer DNA sequences, namely, Sanger and Maxam-Gilbert sequencing, were not available until 1977 (71, 96). Of these, only Sanger sequencing underwent improvements that led to automation and therefore, for the next 30 years, large-scale sequencing projects, e.g., whole-genome sequencing for various species, relied on this technology (41).
However, despite (or indeed because of) much progress in the area of genome sequencing, it became clear that even more information was to be gained not from sequencing just one genome per species but from sequencing and comparing the genomes of different individuals or strains/lines of the same species. This would enable a better grasp of genetic diversity and, in the case of humans, allow "personalized medicine" approaches. To make this feasible, novel techniques were needed that overcame the limitations of Sanger sequencing with respect to throughput and costs (98). In the last decade, a number of different methods were developed that not only have revolutionized the field of genome sequencing but also can be applied to other biological questions not previously addressed by sequencing-based approaches. This review provides an overview of these so-called second-generation or next-generation sequencing (NGS) technologies and their applications, with a special focus on questions relevant to the biology of eukaryotic microorganisms.
NEXT-GENERATIO...