We carried out a simulation study to compare the efficiency of three alternative programs (dfdist, detseld and bayescan) in detecting loci under directional selection from genome-wide scans using dominant markers. We also evaluated the efficiency of multiple-testing correction for those methods that use a classical probability approach. Across a wide range of scenarios, bayescan appears to be more efficient than the other methods, usually detecting a high percentage of truly selective loci while flagging less than 1% of loci as outliers (false positives) under a fully neutral model. In addition, the percentage of outliers detected by this software is always correlated with the true percentage of selective loci in the genome. Our results show, nevertheless, that false positives are common even with a combination of methods and multitest correction, suggesting that conclusions obtained from this approach should be taken with extreme caution.
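To make the multiple-testing step concrete, the sketch below applies a Benjamini-Hochberg false discovery rate correction to a list of per-locus p-values. The abstract does not say which correction was used, so the choice of Benjamini-Hochberg, the function name `benjamini_hochberg`, the `alpha` level and the example p-values are all assumptions made for illustration only.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Return, for each locus, whether it stays significant after BH correction.

    Hypothetical helper: the correction used in the study is not stated.
    """
    m = len(p_values)
    # Indices of the loci sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank

    # Every locus ranked at or below max_rank is declared an outlier.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Illustrative p-values for five loci (made-up numbers, not study data).
p_values = [0.001, 0.045, 0.025, 0.20, 0.008]
uncorrected = [p < 0.05 for p in p_values]
corrected = benjamini_hochberg(p_values, alpha=0.05)
```

With these made-up values, four loci pass the uncorrected 0.05 threshold but only three remain after correction, which illustrates how correction can reduce, without eliminating, the number of candidate outliers.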
The pan-genome of a species is defined as the union of all the genes and non-coding sequences found in all its individuals. However, constructing a pan-genome for plants with large genomes is daunting both in sequencing cost and the scale of the required computational analysis. A more affordable alternative is to focus on the genic repertoire by using transcriptomic data. Here, the software GET_HOMOLOGUES-EST was benchmarked with genomic and RNA-seq data of 19 Arabidopsis thaliana ecotypes and then applied to the analysis of transcripts from 16 Hordeum vulgare genotypes. The goal was to sample their pan-genomes and classify sequences as core, if detected in all accessions, or accessory, when absent in some of them. The resulting sequence clusters were used to simulate pan-genome growth, and to compile Average Nucleotide Identity matrices that summarize intra-species variation. Although transcripts were found to under-estimate pan-genome size by at least 10%, we concluded that clusters of expressed sequences can recapitulate phylogeny and reproduce two properties observed in A. thaliana gene models: accessory loci show lower expression and higher non-synonymous substitution rates than core genes. Finally, accessory sequences were observed to preferentially encode transposon components in both species, plus disease resistance genes in cultivated barleys, and a variety of protein domains from other families that appear frequently associated with presence/absence variation in the literature. These results demonstrate that pan-genome analyses are useful to explore germplasm diversity.
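The core/accessory rule described above can be sketched in a few lines. This is a minimal illustration, not the GET_HOMOLOGUES-EST implementation: the function name `classify_clusters`, the input layout (a mapping from cluster identifier to the set of accessions in which it was detected) and the accession names are hypothetical.

```python
from typing import Dict, Set


def classify_clusters(
    clusters: Dict[str, Set[str]], accessions: Set[str]
) -> Dict[str, str]:
    """Label each sequence cluster as 'core' or 'accessory'.

    A cluster is core if it was detected in every accession and
    accessory if it is absent from at least one of them.
    """
    labels = {}
    for cluster_id, present_in in clusters.items():
        # accessions <= present_in is True when no accession is missing.
        labels[cluster_id] = "core" if accessions <= present_in else "accessory"
    return labels


# Toy data with three hypothetical accessions.
accessions = {"acc1", "acc2", "acc3"}
clusters = {
    "cluster_A": {"acc1", "acc2", "acc3"},
    "cluster_B": {"acc1", "acc3"},
    "cluster_C": {"acc2"},
}
labels = classify_clusters(clusters, accessions)
```

Here `cluster_A` is labelled core, while `cluster_B` and `cluster_C` are accessory because each is missing from at least one accession.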
Drought causes important losses in crop production every season. Improvement of drought tolerance could take advantage of the diversity held in germplasm collections, much of which has not yet been incorporated into modern breeding. Spanish landraces constitute a promising resource for barley breeding, as they were widely grown until the last century and still show good yielding ability under stress. Here, we study the transcriptome expression landscape in two genotypes, an outstanding Spanish landrace-derived inbred line (SBCC073) and a modern cultivar (Scarlett). Gene expression of adult plants was monitored after prolonged stress, either drought alone or drought combined with heat. The transcriptome of mature leaves showed few changes under severe drought, whereas abundant gene expression changes were observed under combined mild drought and heat. Developing inflorescences of SBCC073 exhibited mostly unaltered gene expression, whereas numerous changes were found in the same tissues for Scarlett. Genotypic differences in physiological traits and gene expression patterns confirmed the different behavior of landrace SBCC073 and cultivar Scarlett under abiotic stress, suggesting that they responded to stress following different strategies. A comparison with related studies in barley addressing gene expression responses to drought revealed common biological processes, but only moderate agreement regarding individual differentially expressed transcripts. Special emphasis was placed on the search for co-expressed genes and underlying common regulatory motifs. Overall, 11 transcription factors were identified, and one of them matched cis-regulatory motifs discovered upstream of co-expressed genes involved in those responses.
Using in silico amplified fragment length polymorphism (AFLP) fingerprints, we explore the relationship between sequence similarity and phylogeny accuracy to test when, in terms of genetic divergence, the quality of AFLP data becomes too low to be informative for a reliable phylogenetic reconstruction. We generated DNA sequences with known phylogenies using balanced and unbalanced trees with recent, uniform and ancient radiations, and average branch lengths (from the most internal node to the tip) ranging from 0.02 to 0.4 substitutions per site. The resulting sequences were used to emulate the AFLP procedure. Trees were estimated by maximum parsimony (MP), neighbor-joining (NJ), and minimum evolution (ME) methods from both DNA sequences and virtual AFLP fingerprints. The estimated trees were compared with the reference trees using a score that measures overall differences in both topology and relative branch length. As expected, the accuracy of AFLP-based phylogenies decreased dramatically in the more divergent data sets. Above a divergence of approximately 0.05, AFLP-based phylogenies were largely inaccurate irrespective of the distinct topology, radiation model, or phylogenetic method used. This value represents an upper bound of expected tree accuracy for data sets with a simple divergence history; AFLP data sets with a similar divergence but with unbalanced topologies and short ancestral branches produced much less accurate trees. The lack of homology of AFLP bands quickly increases with divergence and reaches its maximum value (100%) at a divergence of only 0.4. Low guanine-cytosine (GC) contents increase the number of nonhomologous bands in AFLP data sets and lead to less reliable trees. However, the effect of the lack of band homology on tree accuracy is surprisingly small relative to the negative impact due to the low information content of AFLP characters. 
Tree-building methods based on genetic distance displayed similar trends and outperformed parsimony at low but not at high divergences. However, the impact of using alternative phylogenetic methods on tree accuracy was generally small relative to the uncertainty arising from factors such as divergence, nonhomology of bands, or the low information content of AFLP characters. Nevertheless, our data suggest that under certain circumstances, AFLPs may be suitable to reconstruct deeper phylogenies than usually accepted.