Background
Autism spectrum disorder (ASD) is a multifactorial neurodevelopmental disorder. Major interplays between the gastrointestinal (GI) tract and the central nervous system (CNS) seem to be driven by the gut microbiota (GM). Herein, we provide a functional characterization of the GM based on GM metabolomics, mapping of bacterial biochemical pathways, and anamnestic, clinical, and nutritional patient metadata.

Methods
Fecal samples collected from children with ASD and from neurotypical children were analyzed by gas chromatography-mass spectrometry coupled with solid-phase microextraction (GC–MS/SPME) to determine volatile organic compounds (VOCs), in association with a metataxonomic approach based on 16S rRNA gene sequencing. Multivariate and univariate statistical analyses assessed differential VOC profiles and their relationships with ASD anamnestic and clinical features for biomarker discovery. Multiple web-based and machine learning (ML) models identified metabolic predictors of disease, and network analyses correlated GM ecological and metabolic patterns.

Results
The GM core volatilome for all ASD patients was characterized by a high concentration of 1-pentanol, 1-butanol, and phenyl ethyl alcohol; benzeneacetaldehyde, octadecanal, and tetradecanal; methyl isobutyl ketone, 2-hexanone, and acetone; acetic, propanoic, 3-methyl-butanoic, and 2-methyl-propanoic acids; indole and skatole; and o-cymene. Patients were stratified by age, GI symptoms, and ASD symptom severity. Disease risk prediction allowed us to associate butanoic acid with subjects older than 5 years, indole with the absence of GI symptoms and low disease severity, propanoic acid with the ASD risk group, and p-cymene with ASD symptoms, all based on the predictive CBCL-EXT scale. The HistGradientBoostingClassifier model classified ASD patients vs. controls (CTRLs) with an accuracy of 89%, based on methyl isobutyl ketone, benzeneacetaldehyde, phenyl ethyl alcohol, ethanol, butanoic acid, octadecane, acetic acid, skatole, and tetradecanal features.
LogisticRegression models corroborated methyl isobutyl ketone, benzeneacetaldehyde, phenyl ethyl alcohol, skatole, and acetic acid as ASD predictors.

Conclusion
Our results will aid the development of advanced clinical decision support systems (CDSSs), assisted by ML models, for ASD-personalized medicine based on omics data integrated into electronic health/medical records. Furthermore, new ASD screening strategies based on GM-related predictors could improve ASD risk assessment by uncovering novel predictors of ASD onset and risk.