Oleaginous microalgae are a promising feedstock for biofuels, yet the genetic diversity, origin and evolution of oleaginous traits remain largely unknown. Here we present a detailed phylogenomic analysis of five oleaginous Nannochloropsis species (a total of six strains) and one time-series transcriptome dataset for triacylglycerol (TAG) synthesis on one representative strain. Despite small genome sizes, high coding potential and relative paucity of mobile elements, the genomes feature small cores of ca. 2,700 protein-coding genes and a large pan-genome of >38,000 genes. The six genomes share key oleaginous traits, such as the enrichment of selected lipid biosynthesis genes and certain glycoside hydrolase genes that potentially shift carbon flux from chrysolaminaran to TAG synthesis. The eleven type II diacylglycerol acyltransferase genes (DGAT-2) in every strain, each expressed during TAG synthesis, likely originated from three ancient genomes, including the secondary endosymbiosis host and the engulfed green and red algae. Horizontal gene transfers were inferred in most lipid synthesis nodes with expanded gene doses and in many glycoside hydrolase genes. Thus multiple genome pooling and horizontal genetic exchange, together with selective inheritance of lipid synthesis genes and species-specific gene loss, have led to the enormous genetic apparatus for oleaginousness and the wide genomic divergence among present-day Nannochloropsis. These findings have important implications for the screening and genetic engineering of microalgae for biofuels.
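The distinction between a small core and a large pan-genome can be illustrated with a minimal sketch. It assumes gene families have already been clustered across strains; the strain names and family identifiers below are hypothetical placeholders and are not taken from the study, and the function is not the authors' pipeline.

```python
def core_and_pan(families_by_strain):
    """Return (core, pan) gene-family sets.

    core: families present in every strain (set intersection)
    pan:  families present in at least one strain (set union)
    """
    sets = [set(families) for families in families_by_strain.values()]
    if not sets:
        return set(), set()
    core = set.intersection(*sets)
    pan = set.union(*sets)
    return core, pan


# Hypothetical toy input: strain label -> gene-family identifiers.
toy = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam5"},
}

core, pan = core_and_pan(toy)
print(sorted(core))  # families shared by all three toy strains
print(len(pan))      # number of distinct families across the toy strains
```

A core that is much smaller than the pan-genome, as in this toy input, is the pattern the abstract reports for the six Nannochloropsis genomes.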
The ability to rapidly switch the intracellular energy storage form from starch to lipids is an advantageous trait for microalgae feedstock. To probe this mechanism, we sequenced the 56.8-Mbp genome of Chlorella pyrenoidosa FACHB-9, an industrial production strain for protein, starch, and lipids. The genome exhibits positive selection and gene family expansion in lipid and carbohydrate metabolism and in genes related to cell cycle and stress response. Moreover, 10 lipid metabolism genes may have originated from bacteria via horizontal gene transfer. Transcriptomic dynamics tracked via messenger RNA sequencing over six time points during the metabolic switch from starch-rich heterotrophy to lipid-rich photoautotrophy revealed that under heterotrophy, the most strongly expressed genes were from the tricarboxylic acid cycle, respiratory chain, oxidative phosphorylation, gluconeogenesis, glyoxylate cycle, and amino acid metabolism, whereas the most down-regulated genes were from fatty acid and oxidative pentose phosphate metabolism. The shift from heterotrophy into photoautotrophy highlights up-regulation of genes from carbon fixation, photosynthesis, fatty acid biosynthesis, the oxidative pentose phosphate pathway, and starch catabolism. This resulted in a marked redirection of metabolism, where the primary carbon source of glycine is no longer supplied to cell building blocks by the tricarboxylic acid cycle and gluconeogenesis, whereas carbon skeletons from photosynthesis and starch degradation may be directly channeled into fatty acid and protein biosynthesis. By establishing the first genetic transformation in industrial oleaginous C. pyrenoidosa, we further showed that overexpression of an NAD(H) kinase from Arabidopsis (Arabidopsis thaliana) increased cellular lipid content by 110.4% without reducing the growth rate. These findings provide a foundation for exploiting the metabolic switch in microalgae for improved photosynthetic production of food and fuels.
Next-generation sequencing (NGS) technologies have been widely used in life sciences. However, several kinds of sequencing artifacts, including low-quality reads and contaminating reads, are quite common in raw sequencing data and compromise downstream analysis. Quality control (QC) is therefore essential for raw NGS data. Although a few NGS data quality control tools are publicly available, they have two limitations. First, their processing speed cannot keep pace with the rapid increase in data volume. Second, with respect to removing contaminating reads, none of them can identify contaminating sources de novo; they rely heavily on prior information about the contaminating species, which is usually not available in advance. Here we report QC-Chain, a fast, accurate and holistic NGS data quality-control method. It combines user-friendly tools for (1) quality assessment and trimming of raw reads using Parallel-QC, a fast read processing tool; and (2) identification, quantification and filtration of unknown contamination to obtain high-quality clean reads. It was optimized based on parallel computation, so its processing speed is significantly higher than that of other QC methods. Experiments on simulated and real NGS data have shown that reads with low sequencing quality could be identified and filtered. Possible contaminating sources could be identified and quantified de novo, accurately and quickly. Comparison between raw reads and processed reads also showed that the results of subsequent analyses (genome assembly, gene prediction, gene annotation, etc.) based on processed reads improved significantly in completeness and accuracy. With regard to processing speed, QC-Chain achieves a 7- to 8-fold speed-up through parallel computation compared with traditional methods.
Therefore, QC-Chain is a fast and useful quality control tool for read quality processing and de novo contamination filtration of NGS reads, which could significantly facilitate downstream analysis. QC-Chain is publicly available at: http://www.computationalbioenergy.org/qc-chain.html.
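The read quality trimming step described above can be sketched in a few lines. This is a generic illustration of 3'-end quality trimming and mean-quality filtering, not QC-Chain's or Parallel-QC's actual implementation; the function names, the Phred+33 encoding, and the thresholds (`min_q=20`, `min_len=4`) are assumptions chosen for the example.

```python
from statistics import mean


def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def trim_3prime(seq, quality, min_q=20, offset=33):
    """Drop trailing bases whose Phred score is below min_q."""
    scores = phred_scores(quality, offset)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], quality[:end]


def filter_reads(records, min_q=20, min_len=4):
    """Trim each (name, seq, quality) record, then keep reads that are
    still long enough and whose mean quality is at least min_q."""
    kept = []
    for name, seq, quality in records:
        seq, quality = trim_3prime(seq, quality, min_q)
        # The emptiness check avoids taking the mean of an empty read.
        if seq and len(seq) >= min_len and mean(phred_scores(quality)) >= min_q:
            kept.append((name, seq, quality))
    return kept


# Hypothetical toy reads: 'I' encodes Phred 40 and '#' encodes Phred 2.
reads = [
    ("read1", "ACGTACGT", "IIIIII##"),
    ("read2", "ACGT", "####"),
]
clean = filter_reads(reads)
print(clean)
```

Because each read is processed independently, a filter of this shape could be distributed across worker processes, which is one common way to obtain the kind of parallel speed-up the abstract reports.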