Tolerance to stress conditions is vital for organismal survival, including that of bacteria exposed to specific environmental conditions, antibiotics and other perturbations. Some studies have described common modulation and shared genes in the stress response to different types of disturbances (termed the perturbome), leading to the idea of a central control at the molecular level. We implemented a robust machine learning approach to identify and describe genes associated with multiple perturbations, or the perturbome, in a Pseudomonas aeruginosa PAO1 model.
Using public transcriptomic data, we evaluated six approaches to rank and select genes: for each of two data-partitioning methodologies, a single partition (SP method) or multiple partitions (MP method) into training and testing datasets, we evaluated three classification algorithms (SVM, Support Vector Machine; KNN, K-Nearest Neighbors; and RF, Random Forest). Gene expression patterns and topological features at the systems level were included to describe the perturbome elements.
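The difference between the SP and MP schemes can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the study: the data are synthetic, the only classifier is a small hand-written KNN, and each gene is scored by the test accuracy of a single-gene classifier, which is a stand-in ranking criterion and not necessarily the one used here. All function names (`knn_predict`, `gene_scores`, `toy_data`) are hypothetical.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Classify one sample by majority vote among its k nearest training samples."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test samples whose predicted label matches the true label."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y
               for x, y in zip(test_X, test_y))
    return hits / len(test_y)


def split(n, test_fraction, rng):
    """Random partition of sample indices into training and testing sets."""
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = max(1, round(n * test_fraction))
    return idx[n_test:], idx[:n_test]


def gene_scores(X, y, n_partitions, test_fraction=0.3, seed=0):
    """Mean test accuracy of a single-gene KNN classifier over random partitions.

    n_partitions=1 mimics a single-partition (SP) scheme; larger values
    mimic a multiple-partition (MP) scheme. Assumed criterion, for illustration.
    """
    rng = random.Random(seed)
    n_genes = len(X[0])
    totals = [0.0] * n_genes
    for _ in range(n_partitions):
        train, test = split(len(X), test_fraction, rng)
        train_y = [y[i] for i in train]
        test_y = [y[i] for i in test]
        for g in range(n_genes):
            train_X = [[X[i][g]] for i in train]
            test_X = [[X[i][g]] for i in test]
            totals[g] += accuracy(train_X, train_y, test_X, test_y)
    return [t / n_partitions for t in totals]


def toy_data(n_samples=40, n_genes=5, seed=1):
    """Synthetic expression matrix: only gene 0 differs between the two classes."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # 0 = control, 1 = perturbation (hypothetical labels)
        row = [rng.gauss(0, 1) for _ in range(n_genes)]
        row[0] += 6 * label  # shift gene 0 in perturbed samples
        X.append(row)
        y.append(label)
    return X, y


X, y = toy_data()
sp_scores = gene_scores(X, y, n_partitions=1)   # single partition
mp_scores = gene_scores(X, y, n_partitions=20)  # multiple partitions
sp_rank = sorted(range(len(sp_scores)), key=lambda g: -sp_scores[g])
mp_rank = sorted(range(len(mp_scores)), key=lambda g: -mp_scores[g])
print("SP ranking:", sp_rank)
print("MP ranking:", mp_rank)
```

In this toy setting, averaging over many partitions makes the ranking depend less on which samples happen to fall in one particular test set, which is the intuition behind comparing the two schemes.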
We selected and described 46 core response genes associated with multiple perturbations in Pseudomonas aeruginosa PAO1, which can be considered a first report of the P. aeruginosa perturbome.
Molecular annotations, patterns in expression levels and topological features in molecular networks revealed biological functions of biosynthesis, binding and metabolism, many of them related to DNA damage repair and aerobic respiration in the context of tolerance to stress. We also discuss different issues related to the implemented and assessed algorithms, including normalization analysis, data partitioning, classification approaches and metrics. Altogether, this work offers a different and robust framework to select genes using a machine learning approach.