Arctium lappa has a long history of medicinal and edible use and is of great economic importance. Here, we present the first high-quality chromosome-level draft genome of A. lappa, assembled from Illumina and PacBio sequencing data. The assembled genome was approximately 1.79 Gb with a contig N50 of 6.88 Mb. Approximately 1.70 Gb (95.4%) of the contig sequences were anchored onto 18 chromosomes using Hi-C data, which improved the scaffold N50 to 91.64 Mb. Furthermore, we identified 1.12 Gb (68.46%) of repetitive sequences, 32,771 protein-coding genes, and 616 positively selected candidate genes. Among candidate genes related to lignan biosynthesis, the following were found to be highly correlated with the accumulation of arctiin: 4-coumarate-CoA ligase (4CL), dirigent protein (DIR), and hydroxycinnamoyl transferase (HCT). Additionally, we compared the transcriptomes of A. lappa roots at three different developmental stages and identified 8,943 differentially expressed genes (DEGs) in these tissues. These data can be used to identify genes related to A. lappa quality or to provide a basis for molecular identification and comparative genomics among related species.
Background: Mitochondrial genome sequence analysis is of great significance for understanding the evolution and genome structure of different plant species. Arctium lappa and A. tomentosum are distributed in China and frequently used as medicinal plants. A. tomentosum is usually regarded as an adulterant or substitute of A. lappa in traditional Chinese medicine (TCM). It is therefore critically important to identify the different species that are used in medicinal applications. This study aims to determine and compare their mitochondrial genomes, gene structure, and phylogenetic relationship. These results may provide additional insights for the development of genetic research. Results: We determined the complete sequences of the mitochondrial genomes of A. lappa and A. tomentosum for the first time. The mitochondrial genomes of A. lappa and A. tomentosum were each assembled into a single circular molecule, of 312,598 bp and 312,609 bp, respectively. A total of 131 and 130 genes were annotated in the two plants. Fifty pairs of large repeat sequences were detected in A. lappa and A. tomentosum. The number of simple sequence repeats (SSRs) in both species was 192, while the total SSR length was 2,491 bp for A. lappa and 2,489 bp for A. tomentosum. Only 51 single nucleotide polymorphisms (SNPs) and 3 insertion-deletions (InDels) were detected between the two plants. The two mitochondrial genome structures were highly similar and highly collinear. Both the chloroplast and the mitochondrial genomes of the two plants showed evidence of gene exchange and transfer. Core genes and specific genes were analyzed for A. lappa, A. tomentosum, and three closely related Asteraceae species; the specific gene of A. lappa was orf115a. In addition, a phylogenetic tree of the mitochondrial genomes was constructed, which placed the two Arctium species on one branch within Asteraceae.
Conclusions: We identified and analyzed the mitochondrial genome features of two Arctium species in China, with implications for species identification and phylogenetic analysis. The mitochondrial genomes of A. lappa and A. tomentosum were very similar in size and structure. The ORF genes of the two species differed, which could provide a theoretical basis for the development of molecular markers.