Background
Plant mitochondrial genomes (mitogenomes) can be structurally complex, and their size varies from ~222 kbp in Brassica napus to 11.3 Mbp in Silene conica. Relative to the number of plant species, only a few plant mitogenomes have been sequenced and released to date, particularly for conifers (the Pinaceae family). Conifers are an ancient group of land plants that includes about 600 species of great ecological and economic value. Among them, Siberian larch (Larix sibirica Ledeb.) is one of the keystone species of Siberian boreal forests. Yet, despite its importance for evolutionary and population studies, the mitogenome of Siberian larch has not yet been assembled and studied.
Results
Two sources of DNA sequences were used to search for mitochondrial DNA (mtDNA) sequences: mtDNA-enriched samples and nucleotide reads generated in the de novo whole genome sequencing project. The assembly of the Siberian larch mitogenome contained nine contigs, with the shortest and the longest contigs being 24,767 bp and 4,008,762 bp, respectively. The total size of the genome was estimated at 11.7 Mbp. In total, 40 protein-coding, 34 tRNA, and 3 rRNA genes and numerous repetitive elements (REs) were annotated in this mitogenome, and 864 C-to-U RNA editing sites were found in 38 of the 40 protein-coding genes. The immense size of this genome, currently the largest reported, can be partly explained by variable numbers of mobile genetic elements and introns, but is unlikely to be explained by plasmid-related sequences: we found few plasmid-like insertions, representing only 0.11% of the entire Siberian larch mitogenome.
Conclusions
Our study showed that the Siberian larch mitogenome is much larger than those of other gymnosperms studied so far, and is in the same size range as that of the annual flowering plant Silene conica (11.3 Mbp). Similar to other species, the Siberian larch mitogenome contains relatively few genes, and despite its huge size, repeated and low-complexity regions cover only 14.46% of the mitogenome sequence.
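As an illustration of what a C-to-U RNA editing site is, the following minimal Python sketch compares a genomic coding sequence with its already aligned transcript sequence and reports positions where a genomic C reads as T (or U) in the transcript. The abstract does not state how the editing sites were detected, so the function name `c_to_u_sites` and the assumption of equal-length, gap-free sequences are hypothetical and are not the authors' method.

```python
def c_to_u_sites(genomic, transcript):
    """Return 0-based positions where genomic C corresponds to T/U in the transcript.

    Assumption (illustrative only): both sequences are already aligned,
    gap-free, and of equal length.
    """
    if len(genomic) != len(transcript):
        raise ValueError("sequences must be aligned and of equal length")
    pairs = zip(genomic.upper(), transcript.upper())
    return [i for i, (g, t) in enumerate(pairs) if g == "C" and t in ("T", "U")]


if __name__ == "__main__":
    # Toy example, not real data: positions 1 and 3 are edited.
    print(c_to_u_sites("ACGC", "AUGU"))
```

A real analysis would work from read alignments and apply coverage and frequency filters; this sketch only shows the comparison itself.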
Background
The main objectives of this study were sequencing, assembling, and annotating the chloroplast genome of Siberian larch (Larix sibirica Ledeb.), one of the main conifer tree species of Siberian boreal forests, and detecting polymorphic genetic markers: microsatellite loci, or simple sequence repeats (SSRs), and single nucleotide polymorphisms (SNPs).
Results
We used whole genome sequencing data from three Siberian larch trees from different regions: the Urals, Krasnoyarsk, and Khakassia. Sequence reads were obtained using the Illumina HiSeq2000 in the Laboratory of Forest Genomics at the Genome Research and Education Center of the Siberian Federal University. Assembly was performed using the Bowtie2 mapping program and the SPAdes genomic assembler. The genome annotation was performed using the RAST service. We used the GMATo program for the SSR search, and the Bowtie2 and UGENE programs for SNP detection. The assembled chloroplast genome was 122,561 bp long, which is similar to the 122,474 bp genome of the closely related European larch (Larix decidua Mill.). Through annotation and comparison with the existing data, which are available for only three larch species, L. decidua, L. potaninii var. chinensis (complete genome of 122,492 bp), and L. occidentalis (partial genome of 119,680 bp), we identified 110 genes: 34 tRNA, 4 rRNA, and 72 protein-coding genes. In total, 13 SNPs were detected; two of them were in the tRNA-Arg and cell division protein FtsH genes, respectively. In addition, 23 SSR loci were identified.
Conclusions
The complete chloroplast genome sequence was obtained for Siberian larch for the first time. Complete reference chloroplast genomes, such as the one described here, would greatly help in chloroplast resequencing and in the search for additional genetic markers using population samples. The results of this research will be useful for further phylogenetic and gene flow studies in conifers.
Electronic supplementary material
The online version of this article (10.1186/s12859-018-2571-x) contains supplementary material, which is available to authorized users.
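To make the notion of an SSR search concrete, here is a minimal Python sketch that finds perfect tandem repeats with regular expressions. It is not GMATo, the tool named in the abstract, and the minimum repeat counts in `DEFAULT_MIN_REPEATS` are hypothetical thresholds chosen only for illustration.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
DEFAULT_MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-mer followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT");
            # those loci are reported under the shorter motif length.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    toy = "GGC" + "AT" * 7 + "CCG" + "A" * 12 + "TTG"  # toy sequence, not real data
    for start, motif, copies in find_ssrs(toy):
        print(start, motif, copies)
```

Dedicated tools additionally handle imperfect and compound repeats, which this sketch ignores.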
The availability and intensity of sunlight are among the major factors of growth, development, and metabolism in plants. However, excessive illumination disrupts the electronic balance of photosystems and leads to the accumulation of reactive oxygen species in chloroplasts, which in turn mediates several regulatory mechanisms at the subcellular, genetic, and molecular levels. We carried out a comprehensive bioinformatic analysis aimed at identifying genetic systems and candidate transcription factors (TFs) involved in the response to high light stress in Arabidopsis thaliana L., using the resources GEO NCBI, string-db, ShinyGO, STREME, and Tomtom, as well as the programs metaRE, CisCross, and Cytoscape. Through a meta-analysis of five transcriptomic experiments, we selected a set of 1151 differentially expressed genes, including 453 genes that compose the gene network. Ten significantly enriched regulatory motifs for the TF families ZF-HD, HB, C2H2, NAC, BZR, and ARID were found in the promoter regions of the differentially expressed genes. In addition, we predicted families of transcription factors associated with the duration of exposure (RAV, HSF), the intensity of high light treatment (MYB, REM), and the direction of gene expression change (HSF, S1Fa-like). Overall, we predicted the genetic systems involved in the high light response and their expression changes, potential transcriptional regulators, and associated processes.
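One common way to combine evidence for a gene across several independent transcriptomic experiments is Fisher's method for combining p-values. The abstract does not say which combination rule was used, so the sketch below is an assumption made for illustration, and the function name `fisher_combine` is hypothetical. It uses only the Python standard library, relying on the closed-form survival function of the chi-square distribution with an even number of degrees of freedom.

```python
import math


def fisher_combine(pvalues):
    """Combine independent p-values (each in (0, 1]) with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null, whose survival function is
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    """
    k = len(pvalues)
    if k == 0:
        raise ValueError("at least one p-value is required")
    half = -sum(math.log(p) for p in pvalues)  # equals X / 2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    # Toy p-values for one gene in five experiments (not real data).
    print(fisher_combine([0.04, 0.01, 0.20, 0.03, 0.08]))
```

In practice the combined p-values would still need multiple-testing correction across genes before a differentially expressed set is selected.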
The identification of promoters is an essential step in the genome annotation process, providing a framework for gene regulatory networks and their role in transcription regulation. Despite considerable advances in the high-throughput determination of transcription start sites (TSSs) and transcription factor binding sites (TFBSs), experimental methods are still time-consuming and expensive. Instead, several computational approaches have been developed to provide fast and reliable means for predicting the location of TSSs and regulatory motifs on a genome-wide scale. Numerous studies have been carried out on the regulatory elements of mammalian genomes, but plant promoters, especially in gymnosperms, have been left out of the limelight and, therefore, have been poorly investigated. The aim of this study was to enhance and expand the existing genome annotations using computational approaches for genome-wide prediction of TSSs in the four conifer species: loblolly pine, white spruce, Norway spruce, and Siberian larch. Our pipeline will be useful for TSS predictions in other genomes, especially for draft assemblies, where reliable TSS predictions are not usually available. We also explored some of the features of the nucleotide composition of the predicted promoters and compared the GC properties of conifer genes with model monocot and dicot plants. Here, we demonstrate that even incomplete genome assemblies and partial annotations can be a reliable starting point for TSS annotation. The results of the TSS prediction in four conifer species have been deposited in the Persephone genome browser, which allows smooth visualization and is optimized for large data sets. This work provides the initial basis for future experimental validation and the study of the regulatory regions to understand gene regulation in gymnosperms.
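The GC properties mentioned above can be illustrated with a minimal Python sketch that computes overall GC content and GC content at third codon positions (GC3). The abstract does not specify which GC measures were compared, so these two measures and the function names `gc_content` and `gc3_content` are assumptions made for illustration.

```python
def gc_content(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def gc3_content(cds):
    """GC content at third codon positions of an in-frame coding sequence."""
    return gc_content(cds[2::3])


if __name__ == "__main__":
    toy_cds = "ATGGCCAAATGA"  # toy sequence, not real data
    print(gc_content(toy_cds), gc3_content(toy_cds))
```

Ambiguous bases such as N are excluded from the denominator, so partially resolved draft assemblies do not bias the estimate toward zero.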