Background Nematode model organisms such as Caenorhabditis elegans and Pristionchus pacificus are powerful systems for studying the evolution of gene function at a mechanistic level. However, the identification of P. pacificus orthologs of candidate genes known from C. elegans is complicated by the discrepancy in the quality of gene annotations, a common problem in nematode and invertebrate genomics. Results Here, we combine comparative genomic screens for suspicious gene models with community-based curation to further improve the quality of gene annotations in P. pacificus. We extend previous curations of one-to-one orthologs to larger gene families and also orphan genes. Cross-species comparisons of protein lengths, screens for atypical domain combinations and species-specific orphan genes resulted in 4311 candidate genes that were subject to community-based curation. Corrections for 2946 gene models were implemented in a new version of the P. pacificus gene annotations. The new set of gene annotations contains 28,896 genes and has a single copy ortholog completeness level of 97.6%. Conclusions Our work demonstrates the effectiveness of comparative genomic screens to identify suspicious gene models and the scalability of community-based approaches to improve the quality of thousands of gene models. Similar community-based approaches can help to improve the quality of gene annotations in other invertebrate species, including parasitic nematodes.
Nematodes such as Caenorhabditis elegans are powerful systems to study basically all aspects of biology. Their species richness together with tremendous genetic knowledge from C. elegans facilitate the evolutionary study of biological functions using reverse genetics. However, the ability to identify orthologs of candidate genes in other species can be hampered by erroneous gene annotations. To improve gene annotation in the nematode model organism Pristionchus pacificus, we performed a genome-wide screen for C. elegans genes with potentially incorrectly annotated P. pacificus orthologs. We initiated a community-based project to manually inspect more than two thousand candidate loci and to propose new gene models based on recently generated Iso-seq and RNA-seq data. In most cases, misannotation of C. elegans orthologs was due to artificially fused gene predictions and completely missing gene models. The community-based curation raised the gene count from 25,517 to 28,036 and increased the single copy ortholog completeness level from 86% to 97%. This pilot study demonstrates how even small-scale crowdsourcing can drastically improve gene annotations. In future, similar approaches can be used for other species, gene sets, and even larger communities thus making manual annotation of large parts of the genome feasible.
Whole genome duplication (WGD) is often considered a major driver of evolution that leads to phenotypic novelties. However, the importance of WGD for evolution is still controversial because most documented WGD events occurred anciently and few experimental systems amenable to genetic analysis are available. Here, we report a recent WGD event in the hermaphroditic nematode Allodiplogaster sudhausi and present a comparison with a gonochoristic (male/female) sister species that did not undergo WGD. Self-fertilizing reproduction of A. sudhausi make it amenable to functional analysis and an ideal system to study WGD events. We document WGD in A. sudhausi through karyotype analysis and whole genome sequencing, the latter of which allowed us to i) identify functional bias in retention of protein domains and metabolic pathways, ii) show most duplicate genes are under evolutionary constraint, iii) show a link between sequence and expression divergence and iv) characterize differentially expressed duplicates. We additionally show the WGD is associated with increased body size and an abundance of repeat elements (36% of the genome), including a recent expansion of the DNA-hAT/Ac transposon family. Finally, we demonstrate the use of CRISPR/Cas9 to generate mutant knock-outs, whereby two WGD-derived duplicate genes display functional redundancy in that they both need to be knocked out to generate a phenotype. Together, we present a novel experimental system that is convenient for examining and characterizing WGD-derived genes both computationally and functionally.
Background Repetitive sequences and mobile elements make up considerable fractions of individual genomes. While transposition events can be detrimental for organismal fitness, repetitive sequences form an enormous reservoir for molecular innovation. In this study, we aim to add repetitive elements to the annotation of the Pristionchus pacificus genome and assess their impact on novel gene formation. Results Different computational approaches define up to 24% of the P. pacificus genome as repetitive sequences. While retroelements are more frequently found at the chromosome arms, DNA transposons are distributed more evenly. We found multiple DNA transposons, as well as LTR and LINE elements with abundant evidence of expression as single-exon transcripts. When testing whether transposons disproportionately contribute towards new gene formation, we found that roughly 10–20% of genes across all age classes overlap transposable elements with the strongest trend being an enrichment of low complexity regions among the oldest genes. Finally, we characterized a horizontal gene transfer of Zisupton elements into diplogastrid nematodes. These DNA transposons invaded nematodes from eukaryotic donor species and experienced a recent burst of activity in the P. pacificus lineage. Conclusions The comprehensive annotation of repetitive elements in the P. pacificus genome builds a resource for future functional genomic analyses as well as for more detailed investigations of molecular innovations.
Adaptation of organisms to environmental change may be facilitated by the creation of new genes. New genes without homologs in other lineages are known as taxonomically-restricted orphan genes and may result from divergence or de novo formation. Previously, we have extensively characterized the evolution and origin of such orphan genes in the nematode model organism Pristionchus pacificus. Here, we employ large-scale transcriptomics to establish potential functional associations and to measure the degree of transcriptional plasticity among orphan genes. Specifically, we analyzed 24 RNA-seq samples from adult P. pacificus worms raised on 24 different monoxenic bacterial cultures. Based on coexpression analysis, we identified 28 large modules that harbor 3,727 diplogastrid-specific orphan genes and that respond dynamically to different bacteria. These coexpression modules have distinct regulatory architecture and also exhibit differential expression patterns across development suggesting a link between bacterial response networks and development. Phylostratigraphy revealed a considerably high number of family- and even species-specific orphan genes in certain coexpression modules. This suggests that new genes are not attached randomly to existing cellular networks and that integration can happen very fast. Integrative analysis of protein domains, gene expression and ortholog data facilitated the assignments of biological labels for 22 coexpression modules with one of the largest, fast-evolving module being associated with spermatogenesis. In summary, this work presents the first functional annotation for thousands of P. pacificus orphan genes and reveals insights into their integration into environmentally responsive gene networks.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.