Background: Procedures for the detection of signatures of selection can be classified according to the source of information they use to reject the null hypothesis of absence of selection. Three main groups of tests can be identified that are based on: (1) the analysis of the site frequency spectrum, (2) the study of the extension of the linkage disequilibrium across the length of the haplotypes that surround the polymorphism, and (3) the differentiation among populations. The aim of this study was to compare the performance of a subset of these procedures by using a dataset on seven Spanish autochthonous beef cattle populations. Results: Analysis of the correlations between the logarithms of the statistics that were obtained by 11 tests for detecting signatures of selection at each single nucleotide polymorphism confirmed that they can be clustered into the three main groups mentioned above. A factor analysis summarized the results of the 11 tests into three canonical axes that were each associated with one of the three groups. Moreover, the signatures of selection identified with the first and second groups of tests were shared across populations, whereas those with the third group were more breed-specific. Nevertheless, an enrichment analysis identified the metabolic pathways that were associated with each group; they coincided with canonical axes and were related to immune response, muscle development, protein biosynthesis, skin and pigmentation, glucose metabolism, fat metabolism, embryogenesis and morphology, heart and uterine metabolism, regulation of the hypothalamic–pituitary–thyroid axis, hormonal, cellular cycle, cell signaling and extracellular receptors. Conclusions: We show that the results of the procedures used to identify signals of selection differed substantially between the three groups of tests. However, they can be classified using a factor analysis. 
Moreover, each canonical factor that coincided with a group of tests identified different signals of selection, which could be attributed to processes of selection that occurred at different evolutionary times. Nevertheless, the metabolic pathways that were associated with each group of tests were similar, which suggests that the selection events that occurred during the evolutionary history of the populations probably affected the same group of traits. Electronic supplementary material: The online version of this article (doi:10.1186/s12711-016-0258-1) contains supplementary material, which is available to authorized users.
Epigenetics has become one of the major areas of biological research. However, the degree of phenotypic variability that is explained by epigenetic processes remains unclear. From a quantitative genetics perspective, the estimation of variance components is achieved by means of the information provided by the resemblance between relatives. In a previous study, this resemblance was described as a function of the epigenetic variance component and a reset coefficient that indicates the rate of dissipation of epigenetic marks across generations. Given these assumptions, we propose a Bayesian mixed model methodology that allows the estimation of epigenetic variance from a genealogical and phenotypic database. The methodology is based on the development of a T matrix of epigenetic relationships that depends on the reset coefficient. In addition, we present a simple procedure for the calculation of the inverse of this matrix (T⁻¹) and a Gibbs sampler algorithm that obtains posterior estimates of all the unknowns in the model. The new procedure was used with two simulated data sets and with a beef cattle database. In the simulated populations, the results of the analysis provided marginal posterior distributions that included the population parameters in the regions of highest posterior density. In the case of the beef cattle dataset, the posterior estimate of transgenerational epigenetic variability was very low and a model comparison test indicated that a model that did not include it was the most plausible.
Linkage disequilibrium (LD) and persistence of phase are fundamental approaches for exploring the genetic basis of economically important traits in cattle, including the identification of QTL for genomic selection and the estimation of effective population size (Ne) to determine the size of the training populations. In this study, we have used the Illumina BovineHD chip in 168 trios of 7 Spanish beef cattle breeds to obtain an overview of the magnitude of LD and the persistence of LD phase through the physical distance between markers. Also, we estimated the time of divergence based on the persistence of the LD phase and calculated past Ne from LD estimates using different alternatives to define the recombination rate. Estimates of average r² (as a measure of LD) for adjacent markers were close to 0.52 in the 7 breeds and decreased with the distance between markers, although at long distances some LD still remained (0.07 and 0.05 for markers 200 kb and 1 Mb apart, respectively). A panel with a lower boundary of 38,000 SNP would be necessary to launch a successful within-breed genomic selection program. Persistence of phase, measured as the pairwise correlations between estimates of r in 2 breeds at short distances (10 kb), was in the 0.89 to 0.94 range and decreased from a range of 0.33 to 0.52 at 200 kb to a range of 0.01 to 0.08 at 1 Mb. The magnitude of the persistence of phase between the Spanish beef breeds was similar to that found in dairy breeds. For across-breed genomic selection, the size of the SNP panels must be in the range of 50,000 to 83,000 SNP. Estimates of past Ne showed values ranging from 26 to 31 for 1 generation ago in all breeds. The divergence among breeds occurred between 129 and 207 generations ago. The results of this study are relevant for the future implementation of within- and across-breed genomic selection programs in the Spanish beef cattle populations. 
Our results suggest that a reduced subset of the SNP panel would be enough to achieve an adequate precision of the genomic predictions.
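As an illustration of the quantities discussed in this abstract, the following is a minimal sketch of how r² between two SNPs and a past effective population size could be computed. It assumes phased haplotypes coded as 0/1 lists (a hypothetical input format) and uses the widely cited approximation E[r²] ≈ 1/(1 + 4·Ne·c), with c the recombination distance in Morgans and the estimate referring to roughly 1/(2c) generations ago. The abstract does not state which formulas or recombination-rate definitions the authors used, so the function names and the choice of approximation here are assumptions.

```python
def r_squared(hap_a, hap_b):
    """Squared correlation (r^2) between two biallelic SNPs.

    hap_a and hap_b are equal-length lists of 0/1 alleles, one entry per
    phased haplotype (hypothetical input format).
    """
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("haplotype lists must be non-empty and of equal length")
    p_a = sum(hap_a) / n                                # allele frequency at SNP A
    p_b = sum(hap_b) / n                                # allele frequency at SNP B
    p_ab = sum(a * b for a, b in zip(hap_a, hap_b)) / n  # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                                # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


def ne_from_r2(r2, c):
    """Ne implied by E[r^2] = 1 / (1 + 4 * Ne * c), with c in Morgans.

    Assumed approximation; it ignores sampling and mutation corrections.
    """
    if not 0 < r2 <= 1 or c <= 0:
        raise ValueError("need 0 < r2 <= 1 and c > 0")
    return (1 / r2 - 1) / (4 * c)


def generations_ago(c):
    """Approximate number of generations in the past that markers c Morgans apart inform on."""
    if c <= 0:
        raise ValueError("c must be positive")
    return 1 / (2 * c)
```

Under these assumptions, an average r² of 0.2 between markers 0.01 Morgans apart would imply an Ne of about 100, referring to about 50 generations ago.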
In organisms with sexual reproduction, genetic diversity and genome evolution are governed by meiotic recombination caused by crossing-over, which is known to vary within the genome. In this study, we propose a simple method to estimate the recombination rate that makes use of the persistency of linkage disequilibrium (LD) phase among closely related populations. The biological material comprised 171 triplets (sire/dam/offspring) from seven populations of autochthonous beef cattle in Spain (Asturiana de los Valles, Avileña-Negra Ibérica, Bruna dels Pirineus, Morucha, Pirenaica, Retinta, and Rubia Gallega), which were genotyped for 777,962 SNPs with the BovineHD BeadChip. After standard quality filtering, we reconstructed the haplotype phases in the parental individuals and calculated the LD as the correlation (r) between each pair of markers that had a genetic distance < 1 Mb. Subsequently, these correlations were used to calculate the persistency of LD phase between each pair of populations along the autosomal genome. Because the effect of the number of generations of divergence should be equivalent throughout the genome, the distribution of the recombination rate along the genome can then be inferred. In our study, the recombination rate was highest in the largest chromosomes and at the distal portion of the chromosomes. In addition, the persistency of LD phase was highly heterogeneous throughout the genome, with a 25.4-fold ratio between the estimates of the recombination rate from the genomic regions that had the highest (BTA18-7.1 Mb) and the lowest (BTA12-42.4 Mb) estimates. Finally, an overrepresentation enrichment analysis (ORA) showed differences in the enriched gene ontology (GO) terms between the genes located in the genomic regions with estimates of the recombination rate over (or below) the 95th (or 5th) percentile throughout the autosomal genome.
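The idea in this abstract can be sketched in a few lines, under stated assumptions. Persistency of LD phase is taken here as the Pearson correlation, across marker pairs in a genomic window, of the signed r values estimated in two populations. To turn that into a relative recombination rate, the sketch assumes the commonly used approximation corr ≈ exp(−2·c·T), where T is the number of generations since divergence. With T constant along the genome, −ln(corr) is then proportional to c. The abstract does not give the authors' exact estimator, so both the approximation and the function names below are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def relative_recombination_rate(persistence):
    """Value proportional to c, assuming persistence = exp(-2 * c * T).

    T (generations since divergence) is treated as constant along the
    genome, so only ratios between windows are meaningful.
    """
    if not 0 < persistence <= 1:
        raise ValueError("persistence must be in (0, 1]")
    return -math.log(persistence)


# Hypothetical signed r values for the same marker pairs in two populations.
r_pop1 = [0.80, 0.45, -0.30, 0.10, -0.55]
r_pop2 = [0.70, 0.50, -0.20, 0.05, -0.60]
window_persistence = pearson(r_pop1, r_pop2)
window_rate = relative_recombination_rate(window_persistence)
```

Comparing `window_rate` between two windows gives a ratio of recombination rates that does not depend on T, which is the property the abstract relies on.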
The Spanish local beef cattle breeds most likely have a common origin followed by a process of differentiation. This particular historical evolution has most probably left detectable signatures in the genome. The objective of this study was to identify genomic regions associated with differentiation processes in seven Spanish autochthonous populations (Asturiana de los Valles (AV), Avileña-Negra Ibérica (ANI), Bruna dels Pirineus (BP), Morucha (Mo), Pirenaica (Pi), Retinta (Re) and Rubia Gallega (RG)). The BovineHD 777K BeadChip was used on 342 individuals (AV, n=50; ANI, n=48; BP, n=50; Mo, n=50; Pi, n=48; Re, n=48; RG, n=48) chosen to be as unrelated as possible. We calculated the fixation index (FST) and performed a Bayesian analysis named SelEstim. The output of both procedures was very similar, although the Bayesian analysis provided a richer inference and allowed us to calculate significance thresholds by generating a pseudo-observed data set from the estimated posterior distributions. We identified a very large number of genomic regions, but when a very restrictive significance threshold was applied these regions were reduced to only 10. Among them, four regions can be highlighted because they comprised a large number of single nucleotide polymorphisms and showed extremely high signals (Kullback-Leibler divergence (KLD)>6). They are located in BTA 2 (5 575 950 to 10 152 228 base pairs (bp)), BTA 5 (17 596 734 to 18 850 702 bp), BTA 6 (37 853 912 to 39 441 548 bp) and BTA 18 (13 345 515 to 15 243 838 bp) and harbor, among others, the MSTN (Myostatin), KIT-LG (KIT Ligand), LAP3 (leucine aminopeptidase 3), NCAPG (non-SMC condensin I complex, subunit G), LCORL (ligand dependent nuclear receptor corepressor-like) and MC1R (Melanocortin 1 receptor) genes. 
Knowledge of these genomic regions allows the identification of potential targets of recent selection and helps to define potential candidate genes associated with traits of interest, such as coat color, muscle development, fertility, growth, carcass and immunological response.
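For readers unfamiliar with the fixation index, the following minimal sketch computes a per-SNP value across populations from allele frequencies, using the classical heterozygosity-based definition FST = (HT − HS) / HT with equal weight per population. The abstract does not specify which FST estimator the authors used, so this definition, the equal weighting and the function name are assumptions; estimators that correct for sample size will give somewhat different values.

```python
def fst_per_snp(allele_freqs):
    """Heterozygosity-based fixation index for one biallelic SNP.

    allele_freqs holds the frequency of the same allele in each
    population (one value per population, equal weights assumed).
    """
    k = len(allele_freqs)
    if k < 2:
        raise ValueError("need allele frequencies from at least 2 populations")
    if any(p < 0 or p > 1 for p in allele_freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = sum(allele_freqs) / k
    h_total = 2 * p_bar * (1 - p_bar)                        # expected heterozygosity, pooled
    h_within = sum(2 * p * (1 - p) for p in allele_freqs) / k  # mean within-population value
    if h_total == 0:
        raise ValueError("FST is undefined for a SNP fixed in all populations")
    return (h_total - h_within) / h_total


# Hypothetical frequencies of one allele in seven populations.
example_fst = fst_per_snp([0.10, 0.15, 0.90, 0.20, 0.12, 0.18, 0.25])
```

A SNP with the same frequency in every population gives 0, and a SNP fixed for alternative alleles in different populations gives values approaching 1, which is why high values flag regions of differentiation.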