Background
Cone beam computed tomography (CBCT) is often employed on radiation therapy treatment devices (linear accelerators) used in image‐guided radiation therapy (IGRT). For each treatment session, an image of the day must be acquired in order to position the patient accurately and to enable adaptive treatment capabilities, including auto‐segmentation and dose calculation. Reconstructed CBCT images often suffer from artifacts, in particular those induced by patient motion. Deep‐learning based approaches promise ways to mitigate such artifacts.

Purpose
We propose a novel deep‐learning based approach that aims to reduce motion‐induced artifacts in CBCT images and to improve image quality. It is based on supervised learning and includes neural network architectures employed as pre‐ and/or post‐processing steps during CBCT reconstruction.

Methods
Our approach is based on deep convolutional neural networks that complement the standard CBCT reconstruction, which is performed either with the analytical Feldkamp‐Davis‐Kress (FDK) method or with an iterative algebraic reconstruction technique (SART‐TV). The neural networks, which are based on refined U‐net architectures, are trained end‐to‐end in a supervised learning setup. Labeled training data are obtained by means of a motion simulation, which takes as input the two extreme phases of 4D CT scans, their deformation vector fields, and time‐dependent amplitude signals.
The trained networks are validated quantitatively against ground truth using image quality metrics, and qualitatively through an evaluation of real patient CBCT scans by clinical experts.

Results
The presented approach generalizes to unseen data and yields significant reductions in motion‐induced artifacts as well as improvements in image quality compared with existing state‐of‐the‐art CBCT reconstruction algorithms. On an unseen test dataset, the peak signal‐to‐noise ratio (PSNR) improves by up to 6.3 dB and the structural similarity index measure (SSIM) by up to 0.19. A clinical evaluation on real patient scans confirms these findings, with up to 74% preference for motion artifact reduction over standard reconstruction.

Conclusions
For the first time, it is demonstrated, including by means of clinical evaluation, that inserting deep neural networks as pre‐ and post‐processing plugins into the existing 3D CBCT reconstruction and training them end‐to‐end yields significant improvements in image quality and a reduction of motion artifacts.
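
To make the reported PSNR gain easier to interpret, the following minimal sketch computes PSNR from its standard definition, 10·log10(R²/MSE), and compares two reconstructions against the same ground truth. It is an illustration only and not the evaluation code used in this work: the function name `psnr`, the `data_range` argument (the peak value R), and the toy voxel values are assumptions introduced here.

```python
import math


def psnr(reference, estimate, data_range):
    """Peak signal-to-noise ratio in dB for two equally sized voxel sequences.

    data_range is the assumed peak value R (for example 1.0 for images
    normalized to [0, 1]); the abstract does not state which value was used.
    """
    if len(reference) != len(estimate):
        raise ValueError("inputs must contain the same number of voxels")
    # Mean squared error over all voxels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)


# Toy values, made up for illustration (not data from the study).
ground_truth = [0.0, 0.5, 1.0, 0.5]
baseline = [0.2, 0.3, 0.8, 0.7]    # every voxel is off by 0.2
corrected = [0.1, 0.4, 0.9, 0.6]   # every voxel is off by 0.1

# Difference in PSNR between the two reconstructions, in dB.
gain_db = psnr(ground_truth, corrected, 1.0) - psnr(ground_truth, baseline, 1.0)
print(f"PSNR gain: {gain_db:.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, halving the root‐mean‐square error, as in this toy example, raises PSNR by 20·log10(2) ≈ 6.02 dB. A gain of about 6 dB therefore corresponds to roughly half the RMS error relative to the ground truth.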