Purpose
The literature on automated phenotyping of chronic obstructive pulmonary disease (COPD) contains a multitude of isolated classical machine learning and deep learning techniques, most of which investigate individual phenotypes in small study cohorts with heterogeneous meta‐parameters, e.g., different scan protocols or segmented regions. The objective is to compare the impact of different experimental setups, i.e., varying meta‐parameters related to image formation and data representation, with the impact of the learning technique on subtyping automation for a variety of phenotypes. The identified associations of these parameters with automation performance, and their interactions, might be a first step towards determining optimal meta‐parameters, i.e., a meta‐strategy.
Methods
A clinical cohort of 981 patients (53.8 ± 15.1 years, 554 male) was examined. The inspiratory CT images were analyzed to automate the diagnosis of 13 COPD phenotypes given by two radiologists. A benchmark feature set that integrates many quantitative criteria was extracted from the lung. A variety of learning algorithms were trained on the first 654 patients (two thirds), and each trained algorithm was retrospectively assessed on the remaining 327 patients (one third). The automation performance was evaluated by the area under the receiver operating characteristic curve (AUC). In total, 1717 experiments were conducted with varying meta‐parameters such as reconstruction kernel, segmented regions, and input dimensionality, i.e., number of extracted features. The association of the meta‐parameters with the automation performance was analyzed by a multivariable general linear model that decomposes the automation performance into the contributions of the meta‐parameters and the learning technique.
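As an illustration of how such a decomposition can work, the following minimal sketch fits an ordinary least-squares model with dummy-coded factors to a table of experiments and reports, for each factor, the drop in explained variance when that factor is removed. It is not the authors' implementation: the factor names, their levels, the effect sizes, and the synthetic AUC values are all assumptions made purely for demonstration.

```python
import itertools
import numpy as np


def one_hot(levels):
    """Dummy-code a categorical factor, using the first sorted level as reference."""
    uniq = sorted(set(levels))
    return np.array([[1.0 if v == u else 0.0 for u in uniq[1:]] for v in levels])


def r_squared(columns, y):
    """R^2 of an ordinary least-squares fit with an intercept."""
    X = np.column_stack([np.ones(len(y))] + columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)


def factor_contributions(factors, y):
    """Per factor: R^2 of the full model minus R^2 of the model without it."""
    coded = {name: one_hot(vals) for name, vals in factors.items()}
    full = r_squared(list(coded.values()), y)
    return {
        name: full - r_squared([c for n, c in coded.items() if n != name], y)
        for name in coded
    }


# Synthetic, balanced design (hypothetical levels, NOT data from the study).
rng = np.random.default_rng(0)
kernels = ["smooth", "sharp"]
techniques = ["logreg", "cnn3d"]
regions = ["lung", "lobes"]
rows = list(itertools.product(kernels, techniques, regions)) * 10

factors = {
    "kernel": [r[0] for r in rows],
    "technique": [r[1] for r in rows],
    "region": [r[2] for r in rows],
}

# Assumed effects: kernel shifts AUC by 0.08, technique by 0.02, region by 0.
auc = np.array(
    [0.75 + 0.08 * (k == "smooth") + 0.02 * (t == "cnn3d") for k, t, _ in rows]
) + rng.normal(0.0, 0.01, size=len(rows))

contrib = factor_contributions(factors, auc)
for name, value in contrib.items():
    print(f"{name}: share of AUC variance explained = {value:.3f}")
```

In this toy design the kernel factor explains a larger share of the AUC variance than the learning technique only because the assumed effect sizes were chosen that way. With real experiment tables the design is typically unbalanced, so the contributions of different factors overlap and interaction terms may be needed.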
Results
The automation performance varied strongly with the meta‐parameters. For emphysema‐predominant phenotypes, an AUC of 93%–95% could be achieved for the best meta‐configuration. The airways‐predominant phenotypes led to a lower performance of 65%–85%, and smooth kernel configurations were, on average, unexpectedly superior to those with sharp kernels. The performance impact of meta‐parameters, even that of often neglected ones like the missing‐data imputation, was in general larger than that of the learning technique. Advanced learning techniques like 3D deep learning or automated machine learning with non‐optimal meta‐configurations yielded inferior automation performance in comparison to simple techniques with suitable meta‐configurations. The best automation performance was achieved by a combination of modern learning techniques and a suitable meta‐configuration.
Conclusions
Our results indicate that for COPD phenotype automation, study design parameters such as the reconstruction kernel and the model input dimensionality should be adapted to the learning technique and may be more important than the technique itself. To achieve optimal automation and prediction results, the interaction between those meta‐parameters and the learning technique ...