Significance: A high-quality genome assembly of Camellia sinensis var. sinensis facilitates genomic, transcriptomic, and metabolomic analyses of the quality traits that make tea one of the world’s most-consumed beverages. The specific gene family members critical for biosynthesis of key tea metabolites, monomeric galloylated catechins and theanine, are indicated and found to have evolved specifically for these functions in the tea plant lineage. Two whole-genome duplications, critical to gene family evolution for these two metabolites, are identified and dated, but are shown to account for less amplification than subsequent paralogous duplications. These studies lay the foundation for future research to understand and utilize the genes that determine tea quality and its diversity within tea germplasm.
Background: Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro and to transform, and it has a large genome, so little genomic information is available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generating large expression datasets for functional genomic analysis, an approach especially suitable for non-model species with unsequenced genomes. Results: Using high-throughput Illumina RNA-seq, the transcriptome from poly(A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, comprising 788 contig clusters and 126,306 singletons, with an average length of 355 bp and an N50 of 506 bp. This number of unigenes was 10-fold higher than the number of existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as the flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximately 20-fold increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated, and their expression patterns in different organs of the tea plant were analyzed by RT-PCR and quantitative real-time PCR (qRT-PCR). Conclusions: An extensive transcriptome dataset has been obtained from deep sequencing of the tea plant. The coverage of the transcriptome is comprehensive enough to discover all known genes of several major metabolic pathways. This transcriptome dataset can serve as an important public information platform for gene expression, genomics, and functional genomic studies in C. sinensis.
The tea plant is an important economic crop used to produce the world's oldest and most widely consumed tea beverages. Here, we present a high-quality reference genome assembly of the tea plant (Camellia sinensis var. sinensis) consisting of 15 pseudo-chromosomes. LTR retrotransposons (LTR-RTs) account for 70.38% of the genome, and we present evidence that LTR-RTs play critical roles in genome size expansion and in the transcriptional diversification of tea plant genes through preferential insertion in promoter regions and introns. Genes associated with tea aroma and stress resistance, particularly those coding for terpene biosynthesis proteins, were significantly amplified through recent tandem duplications and exist as gene clusters in the tea plant genome. Phylogenetic analysis of the sequences of 81 tea plant accessions with diverse origins revealed three well-differentiated tea plant populations, supporting the proposed southwest origin of the Chinese cultivated tea plant and its later spread to western Asia through introduction. Domestication and modern breeding left significant signatures on hundreds of genes in the tea plant genome, particularly those associated with tea quality and stress resistance. The genomic sequences of the reported reference and resequenced tea plant accessions provide valuable resources for future functional genomics studies and for molecular breeding of improved tea plant cultivars.
Summary: Tea is a non-alcoholic beverage consumed widely around the world, with essential economic and health benefits. Given the growing volume of large-scale omics data sets, particularly the released tea plant genome sequence, a comprehensive knowledgebase is urgently needed to facilitate the use of these data in molecular breeding. We present the first integrative, specially designed web-accessible database, the Tea Plant Information Archive (TPIA; http://tpia.teaplant.org). The current release of TPIA uses the comprehensively annotated tea plant genome as its framework and incorporates abundant, well-organized transcriptomes, gene expression data (across species, tissues and stresses), orthologs and the characteristic metabolites that determine tea quality. It also hosts large collections of transcription factors, polymorphic simple sequence repeats, single nucleotide polymorphisms, correlations, manually curated functional genes and globally collected germplasm information. A variety of versatile analytic tools (e.g. JBrowse, BLAST, enrichment analysis) are provided to help users perform further comparative, evolutionary and functional analyses. We show a case application of TPIA that provides novel insights into the phytochemical content variation of section Thea of genus Camellia under a well-resolved phylogenetic framework. The knowledgebase will serve as a central gateway for the global tea community to better understand tea plant biology, to the broad benefit of the whole tea industry.
Background: Tea plants (Camellia sinensis) are used to produce one of the most important beverages worldwide. The nutritional value and healthful properties of tea are closely related to the large amounts of three major characteristic constituents: polyphenols (mainly catechins), theanine and caffeine. Although oil tea (Camellia oleifera) belongs to the genus Camellia, this plant lacks these three characteristic constituents. Comparative analysis of tea and oil tea via RNA-Seq would help uncover the genetic components underlying the biosynthesis of characteristic metabolites in tea. Results: We found that 3,787 and 3,359 bud genes, as well as 4,042 and 3,302 leaf genes, were up-regulated in tea and oil tea, respectively. High-performance liquid chromatography (HPLC) analysis revealed high levels of all types of catechins, theanine and caffeine in tea compared to those in oil tea. Activation of the genes involved in the biosynthesis of these characteristic compounds was detected by RNA-Seq analysis. In particular, genes encoding enzymes involved in the flavonoid, theanine and caffeine pathways exhibited considerably different expression levels in tea compared to oil tea, differences that were also confirmed by quantitative RT-PCR (qRT-PCR). Conclusion: We assembled 81,826 and 78,863 unigenes for tea and oil tea, respectively, and compared the two species at the transcriptomic level. A potential connection was observed between gene expression and content variation for catechins, theanine and caffeine in tea and oil tea. The results demonstrated that the corresponding metabolism was activated during the accumulation of characteristic metabolites in tea, which were present at low levels in oil tea. From a molecular biological perspective, our comparison of the transcriptomes and related metabolites revealed differential regulatory mechanisms underlying secondary metabolic pathways in tea versus oil tea. Electronic supplementary material: The online version of this article (doi:10.1186/s12870-015-0574-6) contains supplementary material, which is available to authorized users.