Autism spectrum disorder (ASD) is a childhood developmental disorder characterized by impaired communication and social interaction together with repetitive and perseverative behaviors, with a reported prevalence of between 0.75% and 1.68% in the pediatric population. 1-3 ASD is a well-recognized comorbidity in infants with an epileptic encephalopathy, such as infantile spasms; however, the bidirectional and temporal relationship between childhood epilepsy and ASD, as well as the causal mechanisms, is still not fully understood. 4 In
Genome sequencing has identified a large number of putative autism spectrum disorder (ASD) risk genes, revealing possibly disrupted biological pathways; however, the genetic and environmental underpinnings of ASD remain largely unresolved. The presented methodology aimed to identify genetically related clusters of individuals with ASD. Using the VariCarta dataset, which contains data retrieved from 13,069 people with ASD, we compared patients pairwise to build “patient similarity matrices”. Hierarchical agglomerative clustering and heatmap visualization were performed, followed by enrichment analysis (EA). We analyzed whole-genome sequencing data from 2062 individuals and isolated 11,609 genetic variants shared by at least two people. The analysis yielded three clusters comprising, respectively, 574 (27.8%), 507 (24.6%), and 650 (31.5%) individuals. Overall, 4187 variants (36.1%) were common to the three clusters. The EA revealed that the biological processes related to the shared genetic variants were mainly involved in neuron projection guidance and morphogenesis, cell junctions, synapse assembly, and observational, imitative, and vocal learning. The study highlighted genetic networks that were more frequent in a sample of people with ASD than in the overall population. We suggest that itemizing not only single variants but also gene networks might support research into ASD etiopathology. Future work on larger databases will have to ascertain the reproducibility of this methodology.
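The pairwise comparison and clustering steps described above can be illustrated with a minimal sketch. The abstract does not state which similarity measure or linkage rule was used, so the Jaccard index and average linkage below are assumptions chosen for illustration, and the patient and variant identifiers are hypothetical toy data, not VariCarta records.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of variant identifiers (assumed metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(patients):
    """Build a symmetric patient-by-patient similarity matrix."""
    ids = sorted(patients)
    n = len(ids)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        sim[i][i] = 1.0
    for i, j in combinations(range(n), 2):
        sim[i][j] = sim[j][i] = jaccard(patients[ids[i]], patients[ids[j]])
    return ids, sim


def agglomerative(sim, n_clusters):
    """Average-linkage agglomerative clustering on distance = 1 - similarity."""
    clusters = [[i] for i in range(len(sim))]

    def dist(c1, c2):
        total = sum(1.0 - sim[i][j] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    # Repeatedly merge the two closest clusters until n_clusters remain.
    while len(clusters) > max(n_clusters, 1):
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: dist(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters


# Hypothetical toy input: patient id -> set of variant ids.
toy_patients = {
    "P1": {"v1", "v2", "v3"},
    "P2": {"v1", "v2", "v4"},
    "P3": {"v7", "v8"},
    "P4": {"v7", "v8", "v9"},
}
ids, sim = similarity_matrix(toy_patients)
groups = [sorted(ids[i] for i in c) for c in agglomerative(sim, 2)]
print(sorted(groups))
```

This naive implementation recomputes every cluster distance at each merge, which is fine for a toy example but would not scale to thousands of patients; a library routine would be the practical choice at that size.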
Autism spectrum disorder (ASD) is a heterogeneous condition, characterized by complex genetic architectures and intertwined genetic/environmental interactions. Novel analytical approaches that can process large amounts of data are needed to disentangle its pathophysiology. We present an advanced machine learning technique, based on a clustering analysis on genotypical/phenotypical embedding spaces, to identify biological processes that might act as pathophysiological substrates for ASD. This technique was applied to the VariCarta database, which contained 187,794 variant events retrieved from 15,189 individuals with ASD. Nine clusters of ASD-related genes were identified. The 3 largest clusters included 68.6% of all individuals, consisting of 1455 (38.0%), 841 (21.9%), and 336 (8.7%) persons, respectively. Enrichment analysis was applied to isolate clinically relevant ASD-associated biological processes. Two of the identified clusters were characterized by individuals with an increased presence of variants linked to biological processes and cellular components, such as axon growth and guidance, synaptic membrane components, or transmission. The study also suggested other clusters with possible genotype–phenotype associations. Innovative methodologies, including machine learning, can improve our understanding of the biological processes and gene variant networks that underlie the etiology and pathogenic mechanisms of ASD. Future work to ascertain the reproducibility of the presented methodology is warranted.
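The enrichment analysis step can be sketched as an over-representation test. The abstract does not say which statistical test or annotation source was used, so the one-sided hypergeometric test below is an assumption, and the gene identifiers and term-to-gene mapping are hypothetical placeholders. No multiple-testing correction is applied in this sketch.

```python
from math import comb


def hypergeom_pvalue(N, K, n, k):
    """P(X >= k) when n genes are drawn from N, of which K carry the annotation."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def enrich(cluster_genes, annotations, background):
    """Rank annotation terms by over-representation in one cluster's genes."""
    cluster = cluster_genes & background
    results = []
    for term, genes in annotations.items():
        annotated = genes & background
        overlap = len(cluster & annotated)
        p = hypergeom_pvalue(len(background), len(annotated), len(cluster), overlap)
        results.append((term, overlap, p))
    return sorted(results, key=lambda r: r[2])  # smallest p-value first


# Hypothetical toy input: placeholder gene ids and an illustrative annotation map.
background = {f"g{i}" for i in range(1, 11)}
annotations = {
    "axon guidance": {"g1", "g2", "g3"},
    "other process": {"g8", "g9", "g10"},
}
cluster_genes = {"g1", "g2", "g3", "g4"}

for term, overlap, p in enrich(cluster_genes, annotations, background):
    print(f"{term}: overlap={overlap}, p={p:.4f}")
```

In a real analysis the background would be the full set of genes assessed in the dataset, and the resulting p-values would need adjustment for the number of terms tested.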
Background: Developments in gene-hunting techniques have identified several ASD-associated genes. Cluster analysis combined with gene network studies has revealed many disrupted key pathways in ASD, although unraveling its genetic underpinnings remains a challenging task. This study aims to determine, through a novel data-driven approach, how networks of mutated genes impact biological processes underlying autism. Methods: We analyzed the VariCarta dataset, which presents more than 200,000 genomic variant events collected from 13,069 people with ASD. First, we created a whole-genome and an exome sequencing subset. Then, for each subset, we compared the patients of each group pairwise to build “patient similarity matrices”. Hierarchical agglomerative clustering and heatmap visualization were performed to identify clusters of patients with common occurrences of gene networks within these matrices. The subsequent enrichment analysis (EA) highlighted biological processes that might be impacted by the mutated genes of each subgroup. Results: Considering the whole-genome matrix, we identified three main genetic clusters of ASD patients, each one characterized by a network of shared genetic variants. We isolated 11,609 genetic variants shared by at least two subjects in each cluster; 4,187 of these variants (36.1%) were common to the three clusters. Only 331 patients (2.5%) shared no or very few mutated genes with anyone else. The EA highlighted common or cluster-specific biological processes related to the variants. Most of the common abnormal processes were involved in neuron projection guidance and morphogenesis, cell junctions, and synapse assembly. Exome sequencing alone was not effective in identifying ASD subgroups. Limitations: Caution is warranted when interpreting our results, as we did not compare them with a control group and did not verify whether the identified subgroups were actually associated with different phenotypes. Future work will have to ascertain the strength and reproducibility of these results. Conclusions: Itemizing not just single mutated genes, but also the gene networks and specific biological processes that characterize different ASD subpopulations, might allow a better understanding of which networks of genetic variants play a major role in the etiopathology of ASD. The proposed methodology may represent a novel approach to help disentangle ASD complexity and an instrument to support more focused genotype-phenotype studies.