Summary Thlaspi arvense (field pennycress) is being domesticated as a winter annual oilseed crop capable of improving ecosystems and intensifying agricultural productivity without increasing land use. It is a selfing diploid with a short life cycle and is amenable to genetic manipulations, making it an accessible field‐based model species for genetics and epigenetics. The availability of a high‐quality reference genome is vital for understanding pennycress physiology and for clarifying its evolutionary history within the Brassicaceae. Here, we present a chromosome‐level genome assembly of var. MN106‐Ref with improved gene annotation and use it to investigate gene structure differences between two accessions (MN108 and Spring32‐10) that are highly amenable to genetic transformation. We describe non‐coding RNAs, pseudogenes and transposable elements, and highlight tissue‐specific expression and methylation patterns. Resequencing of forty wild accessions provided insights into genome‐wide genetic variation, and QTL regions were identified for a seedling colour phenotype. Altogether, these data will serve as a tool for pennycress improvement in general and for translational research across the Brassicaceae.
Natural plant populations often harbour substantial heritable variation in DNA methylation. However, a thorough understanding of the genetic and environmental drivers of this epigenetic variation requires large-scale and high-resolution data, which currently exist only for a few model species. Here, we studied 207 lines of the annual weed Thlaspi arvense (field pennycress), collected across a large latitudinal gradient in Europe and propagated in a common environment. By screening for variation in DNA sequence and DNA methylation using whole-genome (bisulfite) sequencing, we found significant epigenetic population structure across Europe. Average levels of DNA methylation were strongly context-dependent, with highest DNA methylation in CG context, particularly in transposable elements and in intergenic regions. Residual DNA methylation variation within all contexts was associated with genetic variants, which often co-localized with annotated methylation machinery genes but also with new candidates. Variation in DNA methylation was also significantly associated with climate of origin, with methylation levels being higher in warmer regions and lower in more variable climates. Finally, we used variance decomposition to assess genetic versus environmental associations with differentially methylated regions (DMRs). We found that while genetic variation was generally the strongest predictor of DMRs, the strength of environmental associations increased from CG to CHG and CHH, with climate-of-origin as the strongest predictor in about one third of the CHH DMRs. In summary, our data show that natural epigenetic variation in Thlaspi arvense is significantly associated with both DNA sequence and environment of origin, and that the relative importance of the two factors strongly depends on the sequence context of DNA methylation. T. arvense is an emerging biofuel and winter cover crop; our results may hence be relevant for breeding efforts and agricultural practices in the context of rapidly changing environmental conditions.
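The sequence contexts referred to above (CG, CHG and CHH, where H stands for A, C or T) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `cytosine_context` and `weighted_methylation` and the input layout are assumptions made for illustration, and only forward-strand cytosines are handled.

```python
def cytosine_context(seq, pos):
    """Classify the forward-strand cytosine at `pos` as 'CG', 'CHG' or 'CHH'.

    Returns None if the base is not a C, or if the context cannot be
    determined (sequence end, or an ambiguous base such as N).
    """
    seq = seq.upper()
    if pos < 0 or pos >= len(seq) or seq[pos] != "C":
        return None
    downstream = seq[pos + 1:pos + 3]
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CG"
    # For CHG/CHH we need two unambiguous downstream bases; H = A, C or T.
    if len(downstream) < 2 or downstream[0] not in "ACT":
        return None
    if downstream[1] == "G":
        return "CHG"
    if downstream[1] in "ACT":
        return "CHH"
    return None


def weighted_methylation(sites):
    """Per-context methylation level: methylated reads / total reads.

    `sites` is an iterable of (context, methylated_reads, total_reads)
    tuples; this layout is hypothetical, chosen only for the example.
    """
    totals = {}
    for context, methylated, total in sites:
        acc = totals.setdefault(context, [0, 0])
        acc[0] += methylated
        acc[1] += total
    return {ctx: m / n for ctx, (m, n) in totals.items() if n > 0}


if __name__ == "__main__":
    print(cytosine_context("ACGT", 1))   # CG context
    print(cytosine_context("ACAGT", 1))  # CHG context
    print(weighted_methylation([("CG", 8, 10), ("CG", 2, 10), ("CHH", 1, 20)]))
```

Pooling read counts across sites, rather than averaging per-site ratios, is one common way of summarising methylation per context; the abstract does not state which summary the authors used.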
Several reduced-representation bisulfite sequencing methods have been developed in recent years to determine cytosine methylation de novo in nonmodel species. Here, we present epiGBS2, a laboratory protocol based on epiGBS with a revised and user-friendly bioinformatics pipeline for a wide range of species with or without a reference genome. epiGBS2 is cost- and time-efficient and the computational workflow is designed in a user-friendly and reproducible manner. The library protocol allows a flexible choice of restriction enzymes and a double digest. The bioinformatics pipeline was integrated in the Snakemake workflow management system, which makes the pipeline easy to execute and modular, and parameter settings for important computational steps flexible. We implemented Bismark for alignment and methylation analysis and we preprocessed alignment files by double masking to enable single nucleotide polymorphism calling with Freebayes (epiFreebayes). The performance of several critical steps in epiGBS2 was evaluated against baseline data sets from Arabidopsis thaliana and great tit (Parus major), which confirmed its overall good performance. We provide a detailed description of the laboratory protocol and an extensive manual of the
Whole genome bisulfite sequencing is currently at the forefront of epigenetic analysis, facilitating the nucleotide-level resolution of 5-methylcytosine (5mC) on a genome-wide scale. Specialized software has been developed to accommodate the unique difficulties in aligning such sequencing reads to a given reference, building on the knowledge acquired from model organisms such as human, or Arabidopsis thaliana. As the field of epigenetics expands its purview to non-model plant species, new challenges arise which bring into question the suitability of previously established tools. Herein, nine short-read aligners are evaluated: Bismark, BS-Seeker2, BSMAP, BWA-meth, ERNE-BS5, GEM3, GSNAP, Last and segemehl. Precision-recall of simulated alignments, in comparison to real sequencing data obtained from three natural accessions, reveals that, on balance, BWA-meth and BSMAP are able to make the best use of the data during mapping. The influence of difficult-to-map regions, characterized by deviations in sequencing depth over repeat annotations, is evaluated in terms of the mean absolute deviation of the resulting methylation calls in comparison to a realistic methylome. Downstream methylation analysis is responsive to the handling of multi-mapping reads relative to mapping quality (MAPQ), and potentially susceptible to bias arising from the increased sequence complexity of densely methylated reads.
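The two evaluation measures named above can be sketched as follows. This is an illustrative reading and not the benchmark's actual code: the function names, the dictionary layouts, the `tolerance` parameter, and the interpretation of "mean absolute deviation" as the mean of absolute per-site differences are all assumptions.

```python
def precision_recall(truth, aligned, tolerance=0):
    """Precision and recall of aligned reads against simulated origins.

    truth:   {read_id: (chrom, pos)} for every simulated read
    aligned: {read_id: (chrom, pos)} for reads the aligner reported
             (unmapped reads are simply absent)
    A read counts as correct when it is placed on the right chromosome
    within `tolerance` bases of its true position.
    """
    correct = 0
    for read_id, (chrom, pos) in aligned.items():
        origin = truth.get(read_id)
        if origin is not None and origin[0] == chrom and abs(origin[1] - pos) <= tolerance:
            correct += 1
    precision = correct / len(aligned) if aligned else 0.0
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


def mean_absolute_deviation(called, expected):
    """Mean |called - expected| methylation level over sites present in both.

    Both arguments map (chrom, pos) to a methylation fraction in [0, 1].
    Returns None when the two maps share no sites.
    """
    shared = called.keys() & expected.keys()
    if not shared:
        return None
    return sum(abs(called[site] - expected[site]) for site in shared) / len(shared)


if __name__ == "__main__":
    truth = {"r1": ("chr1", 100), "r2": ("chr1", 200), "r3": ("chr2", 50), "r4": ("chr2", 80)}
    aligned = {"r1": ("chr1", 100), "r2": ("chr1", 500), "r3": ("chr2", 50)}
    print(precision_recall(truth, aligned))
```

In this toy input, two of the three reported alignments are correct and two of the four simulated reads are recovered, so an aligner that leaves uncertain reads unmapped gains precision at the cost of recall.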
Thlaspi arvense (field pennycress) is being domesticated as a winter annual oilseed crop capable of improving ecosystems and intensifying agricultural productivity without increasing land use. It is a selfing diploid with a short life cycle and is amenable to genetic manipulations, making it an accessible field-based model species for genetics and epigenetics. The availability of a high-quality reference genome is vital for understanding pennycress physiology and for clarifying its evolutionary history within the Brassicaceae. Here, we present a chromosome-level genome assembly of var. MN106-Ref with improved gene annotation, and use it to investigate gene structure differences between two accessions (MN108 and Spring32-10) that are highly amenable to genetic transformation. We describe non-coding RNAs, pseudogenes, and transposable elements, and highlight tissue-specific expression and methylation patterns. Resequencing of forty wild accessions provides insights into genome-wide genetic variation as well as QTL regions for flowering time and a seedling color phenotype. Altogether, these data will serve as a tool for pennycress improvement in general and for translational research across the Brassicaceae.