Purpose
We sought to develop machine learning models that detect multileaf collimator (MLC) modeling errors from radiomic features of fluence maps measured with an electronic portal imaging device (EPID) in patient‐specific quality assurance (QA) for intensity‐modulated radiation therapy (IMRT).
Methods
Fluence maps measured with an EPID for 38 beams from 19 clinical IMRT plans were assessed. We created plans with various degrees of error in the MLC modeling parameters [i.e., MLC transmission factor (TF) and dosimetric leaf gap (DLG)] and, for comparison, plans with an MLC positional error. For a total of 152 error plans for each type of error, we calculated a fluence difference map for each beam by subtracting the calculated map from the measured map. A total of 837 radiomic features were extracted from each fluence difference map, and we used random forest regression to determine the number of features included in the training dataset for the machine learning models. For each type of error, we developed machine learning models for binary classification between the error‐free plan and the plan with the corresponding error, using five typical algorithms [decision tree, k‐nearest neighbor (kNN), support vector machine (SVM), logistic regression, and random forest]. We used part of the total dataset to tune the models with fourfold cross‐validation, and we used the remaining test dataset to evaluate the performance of the developed models. A gamma analysis between the measured and calculated fluence maps was also performed with criteria of 3%/2 mm and 2%/2 mm for all types of error.
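The classification workflow described above can be sketched with scikit‐learn as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the feature table is synthetic random data standing in for radiomic features, and the sample count, feature count, number of retained features (`n_keep`), 25% test split, and all hyperparameters are hypothetical choices made only so that the example runs.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for a radiomic feature table (NOT study data):
# rows are fluence difference maps, columns are features.
n_samples, n_features, n_keep = 120, 50, 10  # hypothetical sizes
X = rng.normal(size=(n_samples, n_features))
y = rng.integers(0, 2, size=n_samples)       # 0 = error-free, 1 = error plan
X[y == 1, :5] += 1.0                         # make the first five features informative

# Hold out a test set before any feature ranking or model tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Rank features by random forest regression importance and keep the top n_keep.
ranker = RandomForestRegressor(n_estimators=100, random_state=0)
ranker.fit(X_train, y_train)
top = np.argsort(ranker.feature_importances_)[::-1][:n_keep]

# The five classifier families named in the text, with default settings.
models = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "kNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

cv_accuracy, test_accuracy = {}, {}
for name, model in models.items():
    # Fourfold cross-validation on the training portion only.
    cv_accuracy[name] = cross_val_score(
        model, X_train[:, top], y_train, cv=4).mean()
    # Refit on the full training portion, then score on the held-out test set.
    model.fit(X_train[:, top], y_train)
    test_accuracy[name] = model.score(X_test[:, top], y_test)
    print(f"{name}: CV accuracy {cv_accuracy[name]:.3f}, "
          f"test accuracy {test_accuracy[name]:.3f}")
```

In this sketch the feature ranking uses only the training portion so that the held‐out test set does not influence feature selection; whether the study ordered these steps the same way is not stated in the text above.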
Results
The selected radiomic features and their optimal number were similar for the TF and DLG error detection models but differed from those for the MLC positional error model. The highest sensitivity was 0.913 for the TF error with SVM and logistic regression, 0.978 for the DLG error with kNN and SVM, and 1.000 for the MLC positional error with kNN, SVM, and random forest. The highest specificity was 1.000 for the TF error with a decision tree, SVM, and logistic regression, 1.000 for the DLG error with a decision tree, logistic regression, and random forest, and 0.909 for the MLC positional error with a decision tree and logistic regression. The gamma analysis showed the poorest performance, with sensitivities at 3%/2 mm of 0.737 for both the TF error and the DLG error and 0.882 for the MLC positional error. The addition of another type of error to the fluence maps significantly reduced the sensitivity for the TF and DLG errors, whereas no effect was observed for MLC positional error detection.
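For reference, sensitivity and specificity in a binary error‐detection task of this kind can be computed from the confusion counts as in the short sketch below. The function name `sensitivity_specificity` and the label vectors are hypothetical and chosen for illustration only; they are not results from the study. Label 1 is assumed to denote a plan with an error and label 0 an error‐free plan.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = error, 0 = error-free)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # fraction of error plans that were flagged
    specificity = tn / (tn + fp)  # fraction of error-free plans that passed
    return sensitivity, specificity


# Made-up labels for illustration: 4 error plans and 4 error-free plans.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)  # 0.75 0.75
```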
Conclusions
Compared with conventional gamma analysis, the radiomics‐based machine learning models showed higher sensitivity and specificity in detecting a single type of MLC modeling error and the MLC positional error. Although the developed models need further improvement for detecting multiple types of error, radiomics‐based IMRT QA was shown to be a promising approach for detecting the MLC modeli...