Motivation: Current state-of-the-art tools for the de novo annotation of genes in eukaryotic genomes have to be specifically fitted for each species and still often produce annotations that leave considerable room for improvement. The fundamental algorithmic architecture of these tools has remained largely unchanged for about two decades, limiting learning capabilities. Here, we set out to improve the cross-species annotation of genes from DNA sequence alone with the help of deep learning. The goal is to eliminate the dependency on a closely related gene model while also improving the predictive quality in general with a fundamentally new architecture. Results: We present Helixer, a framework for the development and usage of a cross-species deep learning model that improves significantly on performance and generalizability when compared to more traditional methods. We evaluate our approach by building a single vertebrate model for the base-wise annotation of 186 animal genomes and a separate land plant model for 51 plant genomes. Our predictions are shown to be much less sensitive to the length of the genome than those of a current state-of-the-art tool. We also present two novel post-processing techniques that each further strengthened our annotations, and we show in-depth results of an RNA-Seq-based comparison of our predictions. Our method does not yet produce comprehensive gene models but rather outputs base-pair-wise probabilities. Availability: The source code of this work is available at https://github.com/weberlab-hhu/Helixer under the GNU General Public License v3.0. The trained models are available at https://doi.org/10.5281/zenodo.3974409. Supplementary information: Supplementary data are available at Bioinformatics online.
Background: Silver-Russell syndrome (SRS) is an imprinting disorder characterised by severe primordial growth retardation, relative macrocephaly and a typical facial gestalt. The clinical heterogeneity of SRS is reflected by a broad spectrum of molecular changes, with hypomethylation in 11p15 and maternal uniparental disomy of chromosome 7 (upd(7)mat) as the most frequent findings. Monogenetic causes are rare, but a clinical overlap with numerous other disorders has been reported. However, a comprehensive overview of the contribution of mutations in differential diagnostic genes to phenotypes reminiscent of SRS is missing due to the lack of appropriate tests. With the implementation of next generation sequencing (NGS) tools, this limitation can now be circumvented. Main body: We analysed 75 patients referred for molecular testing for SRS by an NGS-based multigene panel, whole exome sequencing (WES), and trio-based WES. In 21/75 patients, a disease-causing variant could be identified, among them variants in known SRS genes (IGF2, PLAG1, HMGA2). Several patients carried variants in genes which have not yet been considered as differential diagnoses of SRS. Conclusions: WES approaches significantly increase the diagnostic yield in patients referred for SRS testing. Several of the identified monogenetic disorders have a major impact on clinical management and genetic counseling.
De novo pathogenic variants in CNOT3 have recently been reported in a developmental delay disorder (intellectual developmental disorder with speech delay, autism, and dysmorphic facies [IDDSADF, OMIM: #618672]). The patients present with a variable degree of developmental delay and behavioral problems. To date, all reported disease-causing variants occurred de novo and no parent-child transmission was observed. We report for the first time autosomal dominant transmissions of the CNOT3-associated developmental disorder in two unrelated families. The clinical characteristics in our patients match the IDDSADF features reported so far and suggest substantial variability of the phenotype within the same family.
Background: Heterozygous gain-of-function variants in SAMD9L are associated with ataxia-pancytopenia syndrome (ATXPC) and monosomy 7 myelodysplasia and leukemia syndrome-1 (M7MLS1). Association with peripheral neuropathy has rarely been described. Methods: Whole-exome sequencing (WES) from DNA extracted from peripheral blood was performed in a 10-year-old female presenting with demyelinating neuropathy, her similarly affected mother and the unaffected maternal grandparents. In addition to evaluation of single nucleotide variants, thorough work-up of copy number and exome-wide variant allele frequency data was performed. Results: Combined analysis of the mother’s and daughter’s duo-exome data and analysis of the mother’s and her parents’ trio-exome data initially failed to detect a disease-associated variant. More detailed analysis revealed a copy number neutral loss of heterozygosity of 7q in the mother and led to reanalysis of the exome data for respective sequence variants. Here, a previously reported likely pathogenic variant in the SAMD9L gene on chromosome 7q (NM_152703.5:c.2956C>T; p.(Arg986Cys)) was identified that was not detected with standard filter settings because of a low percentage in blood cells (13%). The variant also showed up in the daughter at 32%, a proportion well below the expected 50%, which in each case can be explained by clonal selection processes in the blood due to this SAMD9L variant. Conclusion: The report highlights the specific pitfalls of molecular genetic analysis of SAMD9L and, furthermore, shows that gain-of-function variants in this gene can lead to a clinical picture associated with the leading symptom of peripheral neuropathy. Due to clonal hematopoietic selection, displacement of the mutant allele occurred, making diagnosis difficult.