Background Chloroplasts are intracellular organelles that enable plants to conduct photosynthesis. They arose through the symbiotic integration of a prokaryotic cell into an eukaryotic host cell and still contain their own genomes with distinct genomic information. Plastid genomes accommodate essential genes and are regularly utilized in biotechnology or phylogenetics. Different assemblers that are able to assess the plastid genome have been developed. These assemblers often use data of whole genome sequencing experiments, which usually contain reads from the complete chloroplast genome. Results The performance of different assembly tools has never been systematically compared. Here, we present a benchmark of seven chloroplast assembly tools, capable of succeeding in more than 60% of known real data sets. Our results show significant differences between the tested assemblers in terms of generating whole chloroplast genome sequences and computational requirements. The examination of 105 data sets from species with unknown plastid genomes leads to the assembly of 20 novel chloroplast genomes. Conclusions We create docker images for each tested tool that are freely available for the scientific community and ensure reproducibility of the analyses. These containers allow the analysis and screening of data sets for chloroplast genomes using standard computational infrastructure. Thus, large scale screening for chloroplasts within genomic sequencing data is feasible.
Chloroplasts are photosynthetic organelles in plant cells and contain their own genomic information. That genome can be utilized in different scientific fields like phylogenetics or biotechnology. Thus, different assemblers have been developed specialized in chloroplast assemblies. Those assemblers often use the output of whole genome sequencing experiments as input. Such sequencing data usually contain the complete chloroplast genome information, even if the sequencing aims for the core genome. Different assembly tools have never been systematically compared. Here we present a benchmark of seven chloroplast assembly tools, capable to succeed in more than 60 % of real data sets. Our results show significant differences between the tested assemblers in terms of generating whole chloroplast genome sequences and computational requirements. Moreover, we suggest further development to improve user experience and success rate. In terms of reproducibility, we created docker images for each tested tool, which are available for the scientific community. Following the presented guidelines, users are able to analyze and screen data sets for chloroplast genomes using only standard computer infrastructure. Thus large scale screening for chloroplasts as hidden treasures within genomic sequencing data is feasible.
Genome-wide association studies (GWAS) are integral for studying genotype-phenotype relationships and gaining a deeper understanding of the genetic architecture underlying trait variation. A plethora of genetic associations between distinct loci and various traits have been successfully discovered and published for the model plant Arabidopsis thaliana. This success and the free availability of full genomes and phenotypic data for more than 1,000 different natural inbred lines led to the development of several data repositories. AraPheno (https://arapheno.1001genomes.org) serves as a central repository of population-scale phenotypes in A. thaliana, while the AraGWAS Catalog (https://aragwas.1001genomes.org) provides a publicly available, manually curated and standardized collection of marker-trait associations for all available phenotypes from AraPheno. In this major update, we introduce the next generation of both platforms, including new data, features and tools. We included novel results on associations between knockout-mutations and all AraPheno traits. Furthermore, AraPheno has been extended to display RNA-Seq data for hundreds of accessions, providing expression information for over 28 000 genes for these accessions. All data, including the imputed genotype matrix used for GWAS, are easily downloadable via the respective databases.
Genomic selection uses whole-genome marker models to predict phenotypes or genetic values for complex traits. Some of these models fit interaction terms between markers, and are therefore called epistatic. The biological interpretation of the corresponding fitted effects is not straightforward and there is the threat of overinterpreting their functional meaning. Here we show that the predictive ability of epistatic models relative to additive models can change with the density of the marker panel. In more detail, we show that for publicly available Arabidopsis and rice datasets, an initial superiority of epistatic models over additive models, which can be observed at a lower marker density, vanishes when the number of markers increases. We relate these observations to earlier results reported in the context of association studies which showed that detecting statistical epistatic effects may not only be related to interactions in the underlying genetic architecture, but also to incomplete linkage disequilibrium at low marker density ("Phantom Epistasis"). Finally, we illustrate in a simulation study that due to phantom epistasis, epistatic models may also predict the genetic value of an underlying purely additive genetic architecture better than additive models, when the marker density is low. Our observations can encourage the use of genomic epistatic models with low density panels, and discourage their biological over-interpretation.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.