The aim of the study was to use lung RNA sequencing (RNA-seq) data from chronic obstructive pulmonary disease (COPD) patients to identify the bacteria that are most commonly detected. Additionally, the study sought to investigate how these infections differ between normal lung tissue and lung tissue affected by COPD.
Patients and Methods: We re-analyzed lung RNA-seq data from 99 COPD patients and 93 non-COPD smokers to determine the extent to which the metagenomes differed between the two groups and to assess the reliability of the metagenomes. We used the unmapped reads in the RNA-seq data, that is, the reads that did not align to the human reference genome, to identify infections that are more common in COPD patients.
Results: We identified 18 bacteria that differed significantly between the COPD and non-COPD smoker groups. Among these, Yersinia enterocolitica was more than 30% more abundant in the COPD group. Additionally, we observed differences in detection rate according to smoking history. To ensure the accuracy of our findings and distinguish them from false positives, we double-checked the metagenomic profiles using the Basic Local Alignment Search Tool (BLAST). This allowed us to identify and remove species that Kraken2 had likely misclassified: reads assigned to these species were identified as Staphylococcus aureus by BLAST analysis.
Conclusion: This study highlights the use of unmapped reads, which are not typically used in analyses of sequencing data, to identify microorganisms present in patients with lung diseases such as COPD. This approach expands our understanding of the microbial landscape in COPD and provides insights into the potential role of microorganisms in disease development and progression.
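
The unmapped-read step described in the Methods can be illustrated with a minimal sketch. It assumes the human alignment is available as SAM-format text and relies only on the standard SAM flag bits (0x4 marks an unmapped read). The function name `unmapped_reads_to_fastq` and the example records are hypothetical and are not taken from the study's pipeline; the aligner, file formats, and filtering criteria actually used by the authors are not stated here.

```python
import io

# Standard SAM flag bits (from the SAM format specification).
UNMAPPED = 0x4          # the read itself is unmapped
SECONDARY = 0x100       # secondary alignment
SUPPLEMENTARY = 0x800   # supplementary alignment


def unmapped_reads_to_fastq(sam_lines):
    """Yield FASTQ records for reads flagged as unmapped in SAM text.

    Hypothetical helper for illustration: it skips header lines and
    malformed records, and ignores secondary/supplementary entries.
    """
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        if flag & UNMAPPED and not flag & (SECONDARY | SUPPLEMENTARY):
            yield f"@{name}\n{seq}\n+\n{qual}\n"


# Made-up example records: read1 is mapped, read2 and read3 are unmapped.
example_sam = "\n".join([
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tGGCC\tFFFF",
    "read3\t77\t*\t0\t0\t*\t*\t0\t0\tTTAA\tJJJJ",
])
records = list(unmapped_reads_to_fastq(io.StringIO(example_sam)))
```

In a workflow of this kind, the resulting FASTQ records would then be passed to a taxonomic classifier such as Kraken2, with selected assignments re-checked by BLAST as described in the Results.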