Wood-rotting fungi play an important role in the global carbon cycle because they are the only known organisms that digest wood, the largest carbon stock in nature. In the present study, we used linear discriminant analysis and random forest (RF) machine learning algorithms to predict white- or brown-rot decay modes from the numbers of genes encoding Carbohydrate-Active enZymes with over 98% accuracy. Unlike other algorithms, RF identified specific genes involved in cellulose and lignin degradation, including auxiliary activities (AAs) family 9 lytic polysaccharide monooxygenases, glycoside hydrolase family 7 cellobiohydrolases, and AA family 2 peroxidases, as critical factors. This study sheds light on the complex interplay between genetic information and decay modes and underscores the potential of RF for comparative genomics studies of wood-rotting fungi.
IMPORTANCE
Wood-rotting fungi are classified as white-rot or brown-rot according to the coloration of the decomposed wood, a judgment that can be influenced by human bias. The random forest machine learning algorithm effectively distinguishes white-rot from brown-rot fungi based on the numbers of genes encoding Carbohydrate-Active enZymes. These findings not only aid in the classification of wood-rotting fungi but also facilitate the identification of the enzymes responsible for degrading woody biomass.