Genomes at the species level are dynamic, with genes present in every individual (core) and genes in a subset of individuals (dispensable) that collectively constitute the pan-genome. Using transcriptome sequencing of seedling RNA from 503 maize (Zea mays) inbred lines to characterize the maize pan-genome, we identified 8681 representative transcript assemblies (RTAs) with 16.4% expressed in all lines and 82.7% expressed in subsets of the lines. Interestingly, with linkage disequilibrium mapping, 76.7% of the RTAs with at least one single nucleotide polymorphism (SNP) could be mapped to a single genetic position, distributed primarily throughout the nonpericentromeric portion of the genome. Stepwise iterative clustering of RTAs suggests, within the context of the genotypes used in this study, that the maize genome is restricted and further sampling of seedling RNA within this germplasm base will result in minimal discovery. Genome-wide association studies based on SNPs and transcript abundance in the pan-genome revealed loci associated with the timing of the juvenile-to-adult vegetative and vegetative-to-reproductive developmental transitions, two traits important for fitness and adaptation. This study revealed the dynamic nature of the maize pan-genome and demonstrated that a substantial portion of variation may lie outside the single reference genome for a species.
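The core versus dispensable distinction can be illustrated with a small sketch. It assumes a per-line abundance table and a hypothetical expression threshold (`min_level`); neither the function name nor the threshold comes from the study, and the "absent" label is added only so the toy function handles every input.

```python
def classify_transcripts(expression, min_level=0.0):
    """Label each transcript 'core', 'dispensable', or 'absent'.

    expression: dict mapping a transcript id to a list of abundance
    values, one per inbred line. A transcript counts as expressed in a
    line when its value exceeds min_level (a hypothetical threshold).
    """
    labels = {}
    for transcript, values in expression.items():
        n_expressed = sum(1 for v in values if v > min_level)
        if values and n_expressed == len(values):
            labels[transcript] = "core"          # expressed in every line
        elif n_expressed > 0:
            labels[transcript] = "dispensable"   # expressed in a subset
        else:
            labels[transcript] = "absent"        # not detected in any line
    return labels


# Toy data: three transcripts measured in three lines (made-up values).
toy = {
    "rta_1": [5.2, 3.1, 8.0],
    "rta_2": [0.0, 2.4, 0.0],
    "rta_3": [0.0, 0.0, 0.0],
}
labels = classify_transcripts(toy)
```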
Background: Transforming large amounts of genomic data into valuable knowledge for predicting complex traits has been an important challenge for animal and plant breeders. Prediction of complex traits has not escaped the current excitement about machine learning, including interest in deep learning algorithms such as multilayer perceptrons (MLP) and convolutional neural networks (CNN). The aim of this study was to compare the predictive performance of two deep learning methods (MLP and CNN), two ensemble learning methods [random forests (RF) and gradient boosting (GB)], and two parametric methods [genomic best linear unbiased prediction (GBLUP) and Bayes B] using real and simulated datasets. Methods: The real dataset consisted of 11,790 Holstein bulls with sire conception rate (SCR) records and genotyped for 58k single nucleotide polymorphisms (SNPs). To support the evaluation of deep learning methods, various simulation studies were conducted using the observed genotype data as template, assuming a heritability of 0.30 with either additive or non-additive gene effects, and two different numbers of quantitative trait nucleotides (100 and 1000). Results: In the bull dataset, the best predictive correlation was obtained with GB (0.36), followed by Bayes B (0.34), GBLUP (0.33), RF (0.32), CNN (0.29) and MLP (0.26). The same trend was observed when using mean squared error of prediction. The simulation indicated that when gene action was purely additive, parametric methods outperformed other methods. When the gene action was a combination of additive, dominance, and two-locus epistatic effects, the best predictive ability was obtained with gradient boosting, and the superiority of deep learning over the parametric methods depended on the number of loci controlling the trait and on sample size.
In fact, with a large dataset including 80k individuals, the predictive performance of deep learning methods was similar or slightly better than that of parametric methods for traits with non-additive gene action. Conclusions: For prediction of traits with non-additive gene action, gradient boosting was a robust method. Deep learning approaches were not better for genomic prediction unless non-additive variance was sizable.
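The two accuracy measures named in the Results, predictive correlation and mean squared error of prediction, can be sketched as follows. This is a minimal illustration on made-up numbers, assuming the correlation is the Pearson correlation between observed and predicted values in a validation set; the function names are hypothetical and the study's exact cross-validation scheme is not reproduced here.

```python
import math


def predictive_correlation(observed, predicted):
    """Pearson correlation between observed and predicted phenotypes."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Made-up validation data: predictions rank animals perfectly but are
# on the wrong scale.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 4.0, 6.0, 8.0]
r = predictive_correlation(observed, predicted)
mse = mean_squared_error(observed, predicted)
```

In this toy case the correlation is perfect while the mean squared error is not zero, which is why reporting both measures, as the abstract does, gives a fuller picture than either alone.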
Two retrospective studies examining data of 7,500 lactating cows from a single herd were performed with the objective of evaluating the long-term effects of clinical disease during the early postpartum period on milk production, reproduction, and culling of dairy cows through 305 days in milk (DIM). In the first study, data regarding health, milk production, reproduction, and culling of 5,085 cows were summarized. Cows were classified according to incidence of a clinical problem (metritis, mastitis, lameness, digestive problem, or respiratory problem) during the first 21 DIM (ClinD21). During 305 d of lactation, cows that had ClinD21 produced, on average, 410 kg less milk, 17 kg less fat, and 12 kg less protein compared with cows that did not have ClinD21 (NoClinD21). Although the interval to first breeding was not different between groups of interest, pregnancy rate through 305 DIM was lower in cows that had ClinD21 [adjusted hazard ratio (AHR) = 0.81]. When individual breedings were analyzed, cows that had ClinD21 presented lower rates of pregnancy per breeding for breedings performed before 150 DIM, fewer calvings per breeding for breedings performed before 200 DIM, and a greater number of pregnancy losses for all breedings performed through 305 DIM. The rate of culling from calving through 305 DIM was higher in cows that had a single ClinD21 (AHR = 1.79) and in cows that had multiple ClinD21 (AHR = 3.06), which resulted in a greater proportion of cows leaving the herd by 305 DIM (NoClinD21 = 22.6%; single ClinD21 = 35.7%; multiple ClinD21 = 53.8%). In the second study, data regarding postpartum health and 305-d yields of milk, fat, and protein were collected from 2,415 primiparous cows that had genomic testing information. Genomic estimated breeding values (EBV) were used to predict 305-d yields of milk, fat, and protein.
Genomic EBV and predicted yields of milk, fat, and protein did not differ between cows that had ClinD21 and those that did not have ClinD21. In contrast, the observed 305-d yields of milk, fat, and protein were reduced by 345, 10, and 10 kg, respectively, in cows that had ClinD21 compared with cows that did not have ClinD21. We conclude that clinical disease diagnosed and treated during the first 21 DIM has long-term effects on lactation performance, reproduction, and culling of dairy cows, which contribute to detrimental consequences of health problems on sustainability of dairy herds. Replication of our studies in multiple herds will be important to confirm our findings in a larger population.
Background: Fertility is considered an important economic trait in dairy cattle. Most studies have investigated cow fertility while bull fertility has received much less consideration. The main objective of this study was to perform a comprehensive genomic analysis in order to unravel the genomic architecture underlying sire fertility in Holstein dairy cattle. The analysis included the application of alternative genome-wide association mapping approaches and the subsequent use of diverse gene set enrichment tools. Results: The association analyses identified at least eight genomic regions strongly associated with bull fertility. Most of these regions harbor genes, such as KAT8, CKB, TDRD9 and IGF1R, with functions related to sperm biology, including sperm development, motility and sperm-egg interaction. Moreover, the gene set analyses revealed many significant functional terms, including fertilization, sperm motility, calcium channel regulation, and SNARE proteins. Most of these terms are directly implicated in sperm physiology and male fertility. Conclusions: This study contributes to the identification of genetic variants and biological processes underlying sire fertility. These findings can provide opportunities for improving bull fertility via marker-assisted selection. Electronic supplementary material: The online version of this article (doi:10.1186/s12863-016-0454-6) contains supplementary material, which is available to authorized users.
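The abstract does not say which statistical test its gene set enrichment tools apply. One common choice is a one-sided hypergeometric test, sketched below under that assumption; the function name, argument names, and counts are hypothetical and are not taken from the study.

```python
from math import comb


def enrichment_p_value(n_background, n_in_term, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background: genes tested in total
    n_in_term:    background genes annotated to the functional term
    n_selected:   genes flagged by the association analysis
    n_overlap:    flagged genes that are also annotated to the term
    """
    max_overlap = min(n_in_term, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Made-up counts: 10 background genes, 3 in the term, 3 flagged,
# and all 3 flagged genes fall in the term.
p = enrichment_p_value(10, 3, 3, 3)
```

A small p-value here means the overlap between flagged genes and the term is larger than random sampling from the background would typically produce.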
Background: A valuable tool for both research and industry, in vitro fertilization (IVF) has applications ranging from gamete selection and preservation of traits to cloning. Although IVF has achieved worldwide use, with approximately 339,685 bovine embryos transferred in 2010 alone, difficulties with efficiency persist. It is rare for more than 40% of in vitro fertilized cattle oocytes to reach the blastocyst stage by day 8 of culture, and pregnancy rates are reported as less than 45% for in vitro produced embryos. To investigate the potential influence of IVF on embryonic development, this study compares in vivo- and in vitro-derived bovine blastocysts at a similar stage and quality grade (expanded, excellent quality) to determine the degree of transcriptomic variation beyond morphology using RNA-Seq. Results: A total of 26,906,451 and 38,184,547 fragments were sequenced for in vitro and in vivo embryo pools, respectively. We detected expression for a total of 17,634 genes, with 793 genes showing differential expression between the two embryo populations with false discovery rate (FDR) < 0.05. There were also 395 novel transcribed units found, of which 45 were differentially expressed (FDR < 0.05). In addition, 4,800 genes showed evidence of alternative splicing, with 873 genes displaying differential alternative splicing between the two pools (FDR < 0.05). Using GO enrichment analysis, multiple biological pathways were found to be significantly enriched for differentially expressed genes (FDR < 0.01), including cholesterol and sterol synthesis, system development, and cell differentiation. Conclusions: Our results support that IVF may influence embryos at the transcriptomic level and that morphology alone is limited in fully characterizing bovine preimplantation embryos.
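The abstract reports FDR thresholds but does not name the procedure used to control the false discovery rate. The sketch below assumes the widely used Benjamini-Hochberg adjustment, applied to made-up p-values; the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values never
    # increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.5]
adjusted = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adjusted) if q < 0.05]
```

With these toy values only the first gene passes an FDR cutoff of 0.05, even though three raw p-values are below 0.05, which shows how the adjustment tightens the criterion when many genes are tested at once.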