To provide the best possible care to each patient, physicians need to customize treatments for individuals with the same health state, especially when treating diseases that can progress and require additional treatments, such as cancer. Decision-making at multiple stages as a disease progresses can be formalized as a dynamic treatment regime (DTR). Most existing optimization approaches for estimating DTRs, including the popular method of Q-learning, were developed in a frequentist context. Recently, a general Bayesian machine learning framework was proposed that facilitates the use of Bayesian regression modeling to optimize DTRs. In this article, we adapt this approach to censored outcomes by fitting Bayesian additive regression trees (BART) at each stage under the accelerated failure time modeling framework, and we compare the proposed approach with Q-learning in simulation studies and a real data example. We also develop an R wrapper function that uses a standard BART survival model to optimize DTRs for censored outcomes. The wrapper function can easily be extended to accommodate any type of Bayesian machine learning model.