Purpose
Patient‐specific quality assurance (QA) for intensity‐modulated radiation therapy (IMRT) is a ubiquitous clinical procedure, but conventional methods have often been criticized as insensitive to errors or less effective than other common physics checks. Recently, there has been interest in applying radiomics, the quantitative extraction of image features, to radiotherapy QA. In this work, we investigate a deep learning approach to classify the presence or absence of introduced radiotherapy treatment delivery errors from patient‐specific QA.
Methods
Planar dose maps from 186 IMRT beams from 23 IMRT plans were evaluated. Each plan was transferred to a cylindrical phantom CT geometry. Three sets of planar doses were exported from each plan corresponding to (a) the error‐free case, (b) a random multileaf collimator (MLC) error case, and (c) a systematic MLC error case. Each plan was delivered to the electronic portal imaging device (EPID), and planned and measured doses were used to calculate gamma images in an EPID dosimetry software package (for a total of 558 gamma images). Two radiomic approaches were used. In the first, a convolutional neural network with triplet learning was used to extract image features from the gamma images. The second was a handcrafted approach using texture features. The resulting metrics from both approaches were input into four machine learning classifiers (support vector machines, multilayer perceptrons, decision trees, and k‐nearest‐neighbors) to determine whether images contained the introduced errors. Two experiments were considered: the two‐class experiment classified images as error‐free or containing any MLC error, and the three‐class experiment classified images as error‐free, containing a random MLC error, or containing a systematic MLC error. Additionally, threshold‐based passing criteria were calculated for comparison.
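The threshold‐based comparison mentioned above can be illustrated with a minimal sketch. It assumes the common convention that a pixel passes when its gamma value is at most 1, and it uses a hypothetical 95% minimum passing rate; the abstract does not state the criteria actually used, and the function names and toy image below are illustrative only.

```python
from typing import Sequence


def gamma_passing_rate(gamma_image: Sequence[Sequence[float]]) -> float:
    """Return the fraction of pixels whose gamma value is <= 1."""
    values = [g for row in gamma_image for g in row]
    if not values:
        raise ValueError("gamma image is empty")
    passed = sum(1 for g in values if g <= 1.0)
    return passed / len(values)


def passes_threshold(
    gamma_image: Sequence[Sequence[float]],
    min_passing_rate: float = 0.95,  # hypothetical criterion, not from the source
) -> bool:
    """Return True when the passing rate meets the chosen minimum."""
    return gamma_passing_rate(gamma_image) >= min_passing_rate


# Toy 2x4 gamma image (made-up values, for illustration only).
toy_gamma = [
    [0.2, 0.5, 0.9, 1.0],
    [0.3, 1.4, 0.7, 0.1],
]

rate = gamma_passing_rate(toy_gamma)       # 7 of 8 pixels have gamma <= 1
flagged = not passes_threshold(toy_gamma)  # flagged when the rate is below the minimum
```

A criterion of this kind reduces each gamma image to a single number, whereas the radiomic approaches use features derived from the spatial pattern of the image.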
Results
In total, 303 gamma images were used for model training and 255 images were used for model testing. The highest classification accuracy was achieved with the deep learning approach, with a maximum accuracy of 77.3% in the two‐class experiment and 64.3% in the three‐class experiment. The performance of the handcrafted approach with texture features was lower, with a maximum accuracy of 66.3% in the two‐class experiment and 53.7% in the three‐class experiment. Variability between the results of the four machine learning classifiers was lower for the deep learning approach than for the texture feature approach. Both radiomic approaches were superior to threshold‐based passing criteria.
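As a consistency check on the reported counts, and assuming each of the 186 beams contributes one gamma image per case (error‐free, random MLC error, systematic MLC error), the totals agree with the training and testing split:

```latex
\[
186 \times 3 = 558 = 303_{\text{train}} + 255_{\text{test}},
\qquad
\frac{303}{558} \approx 54.3\%\ \text{of images used for training.}
\]
```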
Conclusions
Deep learning with convolutional neural networks can be used to classify the presence or absence of introduced radiotherapy treatment delivery errors from patient‐specific gamma images. The performance of the deep learning network was superior to a handcrafted approach with texture features, and both radiomic approaches were better than threshold‐based passing criteria. The results suggest that radiomic QA is a promising direction f...