Sepsis is defined as a life-threatening organ dysfunction arising from a dysregulated host response to infection. The last two decades have seen significant progress in our understanding of the host component of sepsis. However, detailed study of the composition of the microbial community present in sepsis, or indeed, the finer characterisation of this community as a predictive tool to identify sepsis cases, has only received limited attention. The microbial component of sepsis has been attributed to a heterogenous array of pathogens, usually identified through targeted culture. Metagenomic sequencing of microbial cell-free DNA (cfDNA) offers opportunities to instead classify the full spectrum of microorganisms in a clinical sample. In this study, we statistically characterise the microbial component of sepsis using microbial taxonomic assignments from metagenomes generated from blood samples collected in septic and healthy patients. Using gradient-boosted tree classifiers, we demonstrate the remarkable performance of microbial abundances alone to identify patients with sepsis (AUROC = 0.999; 95% CI 0.997-1.000). Additionally, we demonstrate the promising application of Shapley Additive exPlanations (SHAP), a recently developed model interpretation approach, to reduce sepsis into a set of 23 genera sufficient for diagnosis (AUROC = 0.959; 95% CI 0.903-0.991) and to aid in determining which pathogens contribute most to each diagnosis. While the prevailing clinical diagnostic paradigms seek to identify single causative agents, we instead find evidence for a polymicrobial signature of sepsis using unsupervised clustering, gradient-boosted tree classifiers and microbial network analysis. We anticipate that this polymicrobial signature can provide insights into the nature of microbial infection during sepsis and yield predictions which can ultimately be leveraged as a clinically useful tool.
In light of these results, we provide some recommendations for future work involving metagenomic sequencing of blood samples.

Author summary

Sepsis is a leading cause of death globally, responsible for an estimated six million deaths every year. Current definitions and research efforts have predominantly focused on understanding the host's response to sepsis and developing faster diagnostics, predictive tools and treatments. The role of the resident and pathogenic microbial community in contributing to disease status is less well characterised. The advent of large-scale metagenomic sequencing of clinical samples offers new opportunities to characterise the species contributing to systemic infections and, unlike culture-based methods, is not limited to organisms that are fast-growing or culturable. We analysed a publicly available metagenomic dataset, comparing the patterns of microbial DNA in the blood plasma of septic patients relative to that of healthy individuals.
Our results provide evidence that septic infections tend to be ...