Purpose: To investigate deep reinforcement learning (DRL) based on historical treatment plans for developing automated radiation adaptation protocols for non-small cell lung cancer (NSCLC) patients, with the aim of maximizing tumor local control at reduced rates of radiation pneumonitis grade 2 (RP2).

Methods: In a retrospective population of 114 NSCLC patients who received radiotherapy, a three-component neural-network framework was developed for DRL of dose-fractionation adaptation. Large-scale patient characteristics included clinical, genetic, and imaging radiomics features in addition to tumor and lung dosimetric variables. First, a generative adversarial network (GAN) was employed to learn the patient population characteristics necessary for DRL training from a relatively limited sample size. Second, a radiotherapy artificial environment (RAE) was reconstructed by a deep neural network (DNN) using both the original and the GAN-synthesized data to estimate the transition probabilities for adapting patients' personalized radiotherapy treatment courses. Third, a deep Q-network (DQN) was applied to the RAE to choose the optimal dose in a response-adapted treatment setting. This multi-component reinforcement-learning approach was benchmarked against real clinical decisions applied in an adaptive dose-escalation clinical protocol, in which 34 patients were treated based on avid PET signal in the tumor and constrained by a 17.2% normal tissue complication probability (NTCP) limit for RP2. The uncomplicated cure probability (P+) was used as the baseline reward function in the DRL.

Results: Taking our adaptive dose-escalation protocol as a blueprint for the proposed DRL (GAN + RAE + DQN) architecture, we obtained an automated dose-adaptation estimate for use approximately two-thirds of the way into the radiotherapy treatment course.
By letting the DQN component freely control the estimated adaptive dose per fraction (ranging from 1 to 5 Gy), the DRL automatically favored dose escalation/de-escalation between 1.5 and 3.8 Gy, a range similar to that used in the clinical protocol. The same DQN yielded two patterns of dose escalation for the 34 test patients, depending on the reward variant. First, using the baseline P+ reward function, the individual adaptive fraction doses of the DQN showed tendencies similar to the clinical data (RMSE = 0.76 Gy), but the adaptations suggested by the DQN were generally lower in magnitude (less aggressive). Second, by adjusting the P+ reward function to place greater emphasis on mitigating local failure, better matching of doses between the DQN and the clinical protocol was achieved (RMSE = 0.5 Gy). Moreover, the decisions selected by the DQN showed better concordance with patients' eventual outcomes. In comparison, the traditional temporal-difference (TD) algorithm for reinforcement learning yielded an RMSE = 3.3 Gy due to numerical instabilities and a lack of sufficient learning.

Conclusion: We demonstrated that automated dose adaptation by DRL is a feasible and promising…
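The abstract's core idea, choosing a per-fraction dose from a discretized range so as to maximize a P+-style reward (tumor control minus toxicity), can be illustrated with a deliberately tiny sketch. Everything below is invented for illustration: the two-feature patient state, the logistic TCP/NTCP shapes, and the linear per-action Q-function standing in for the paper's DQN. The single adaptation decision is treated as a contextual bandit, so the Q-target is simply the observed reward.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical discretized action space: adaptive dose per fraction in Gy.
ACTIONS = np.arange(1.0, 5.01, 0.5)

def uncomplicated_cure_probability(dose, radiosensitivity, lung_fragility):
    """Toy stand-in for the P+ reward: tumor control probability (TCP)
    times one minus the normal-tissue complication probability (NTCP)."""
    tcp = 1.0 / (1.0 + np.exp(-(dose - 2.5) * radiosensitivity))
    ntcp = 1.0 / (1.0 + np.exp(-(dose - 4.0) * lung_fragility))
    return tcp * (1.0 - ntcp)

def features(state, a_idx):
    """Per-action linear features: a bias plus the two patient covariates."""
    phi = np.zeros(3 * len(ACTIONS))
    phi[3 * a_idx: 3 * a_idx + 3] = [1.0, state[0], state[1]]
    return phi

w = np.zeros(3 * len(ACTIONS))
EPS, LR = 0.3, 0.05

# Epsilon-greedy training: one adaptation decision per episode, so the
# update regresses Q(state, action) directly onto the observed reward.
for _ in range(5000):
    state = rng.uniform(0.5, 2.0, size=2)  # [radiosensitivity, lung_fragility]
    if rng.random() < EPS:
        a = int(rng.integers(len(ACTIONS)))
    else:
        a = int(np.argmax([w @ features(state, i) for i in range(len(ACTIONS))]))
    r = uncomplicated_cure_probability(ACTIONS[a], state[0], state[1])
    phi = features(state, a)
    w += LR * (r - w @ phi) * phi  # gradient step toward the reward

def best_dose(state):
    """Greedy dose recommendation for a given patient state."""
    q = [w @ features(state, i) for i in range(len(ACTIONS))]
    return float(ACTIONS[int(np.argmax(q))])
```

Because the toy reward penalizes both under-dosing (local failure) and over-dosing (RP2 risk), the greedy policy settles on intermediate doses, mirroring the moderate 1.5-to-3.8 Gy range the abstract reports; reshaping the reward (e.g., up-weighting the TCP term) shifts the recommendations, as with the paper's adjusted P+ variant.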
A language model (LM) estimates the probability of a word sequence and thereby provides a solution to word prediction for a variety of information systems. A recurrent neural network (RNN) is powerful for learning the large-span dynamics of a word sequence in a continuous space. However, training an RNN-LM is an ill-posed problem because of the large number of parameters arising from a large dictionary size and a high-dimensional hidden layer. This paper presents a Bayesian approach to regularizing the RNN-LM and applies it to continuous speech recognition. We aim to penalize an overly complex RNN-LM by compensating for the uncertainty of the estimated model parameters, which is represented by a Gaussian prior. The objective function of the Bayesian classification network is formed as a regularized cross-entropy error function. The regularized model is constructed not only by computing the regularized parameters according to the maximum a posteriori criterion but also by estimating the Gaussian hyperparameter through maximization of the marginal likelihood. A rapid approximation to the Hessian matrix is developed to implement the Bayesian RNN-LM (BRNN-LM) by selecting a small set of salient outer products. The proposed BRNN-LM achieves a sparser model than the RNN-LM. Experiments on different corpora show the robustness of system performance when applying the rapid BRNN-LM under different conditions.
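The central objective, cross-entropy plus a Gaussian-prior penalty whose minimizer is the maximum a posteriori (MAP) estimate, can be shown on a toy softmax word predictor. This sketch omits the paper's RNN, Hessian approximation, and evidence-based hyperparameter update; the data, dimensions, and hyperparameter value are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def train_map(X, Y, lam, steps=500, lr=0.5):
    """Minimize the regularized cross-entropy  E(W) = CE(W) + (lam/2)||W||^2,
    i.e. the negative log-posterior under a zero-mean Gaussian prior with
    precision lam on every weight (a MAP point estimate)."""
    n, d = X.shape
    W = np.zeros((d, Y.shape[1]))
    for _ in range(steps):
        P = softmax(X @ W)
        grad = X.T @ (P - Y) / n + lam * W  # CE gradient + Gaussian-prior term
        W -= lr * grad
    return W

# Toy word-prediction data: 5 context features, a 3-word "vocabulary".
X = rng.normal(size=(200, 5))
true_W = rng.normal(size=(5, 3))
labels = np.argmax(X @ true_W + 0.5 * rng.normal(size=(200, 3)), axis=1)
Y = np.eye(3)[labels]

W_ml = train_map(X, Y, lam=0.0)   # maximum likelihood (no prior)
W_map = train_map(X, Y, lam=1.0)  # Gaussian prior shrinks the weights
```

In the full BRNN-LM, lam would not be hand-picked: it is re-estimated by maximizing the marginal likelihood, which is where the rapid Hessian approximation enters.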
In real-world environments, noisy utterances with variable noise levels are recorded and then converted to i-vectors for cosine-distance or PLDA scoring. This paper investigates the effect of noise-level variability on i-vectors. It demonstrates that noise-level variability causes the i-vectors to shift, so that noise-contaminated i-vectors form clusters in the i-vector space. It also demonstrates that the optimal subspaces for discriminating speakers are noise-level dependent. Based on these observations, this paper proposes using the signal-to-noise ratio (SNR) of utterances as guidance for training a mixture of PLDA models. To maximize the coordination among the PLDA models, the mixture is trained jointly via an EM algorithm using utterances contaminated with noise at various levels. For scoring, given a test i-vector, the marginal likelihoods from the individual PLDA models are linearly combined by the posterior probabilities of the test utterance's SNR, and the verification score is the ratio of the marginal likelihoods. Results based on NIST 2012 SRE suggest that the SNR-dependent mixture of PLDA is not only suitable for situations where the test utterances exhibit a wide range of SNRs but also beneficial for test utterances with an unknown SNR distribution. Supplementary materials containing full derivations of the EM algorithms and scoring functions can be found at http://bioinfo.eie.polyu.edu.hk/mPLDA/SuppMaterials.pdf.
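The scoring step described above — weighting each component's marginal likelihood by the posterior of the test utterance's SNR, then taking the ratio of the two weighted sums — can be sketched in isolation. The SNR means/variances and the per-component likelihood values below are placeholders; in the real system the latter would come from the trained PLDA models, not from the simple Gaussians used here.

```python
import numpy as np

def log_gauss(x, mu, var):
    """Scalar Gaussian log-density (used here for the 1-D SNR model)."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mu) ** 2 / var)

def logsumexp(a):
    m = np.max(a)
    return m + np.log(np.sum(np.exp(a - m)))

# Hypothetical two-component mixture: one PLDA model associated with
# high-SNR (clean) training data, one with low-SNR (noisy) data, each
# paired with a 1-D Gaussian over utterance SNR in dB.
SNR_MEAN = np.array([20.0, 5.0])
SNR_VAR = np.array([25.0, 25.0])
PRIOR = np.array([0.5, 0.5])

def snr_posteriors(snr):
    """P(component k | test utterance's SNR)."""
    logp = np.log(PRIOR) + log_gauss(snr, SNR_MEAN, SNR_VAR)
    logp -= logsumexp(logp)
    return np.exp(logp)

def verification_score(log_like_target, log_like_nontarget, snr):
    """Log-ratio of the SNR-weighted mixtures of per-component marginal
    likelihoods under the target and non-target hypotheses."""
    log_post = np.log(snr_posteriors(snr))
    return (logsumexp(log_post + np.asarray(log_like_target))
            - logsumexp(log_post + np.asarray(log_like_nontarget)))
```

A positive score favors the same-speaker hypothesis; because the posterior weights depend on the test SNR, a noisy test utterance is automatically scored mostly by the noise-matched PLDA component.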