Background and purpose: Access to healthcare data is indispensable for scientific progress and innovation. Sharing healthcare data is time-consuming and notoriously difficult due to privacy and regulatory concerns. The Personal Health Train (PHT) provides a privacy-by-design infrastructure connecting FAIR (Findable, Accessible, Interoperable, Reusable) data sources and allows distributed data analysis and machine learning. Patient data never leaves a healthcare institute. Materials and methods: Lung cancer patient-specific databases (tumor staging and post-treatment survival information) of oncology departments were translated according to a FAIR data model and stored locally in a graph database. Software was installed locally to enable deployment of distributed machine learning algorithms via a central server. Algorithms (MATLAB, code and documentation publicly available) are patient privacy-preserving as only summary statistics and regression coefficients are exchanged with the central server. A logistic regression model to predict post-treatment two-year survival was trained and evaluated by receiver operating characteristic curves (ROC), root mean square prediction error (RMSE) and calibration plots. Results: In 4 months, we connected databases with 23 203 patient cases across 8 healthcare institutes in 5 countries (Amsterdam, Cardiff, Maastricht, Manchester, Nijmegen, Rome, Rotterdam, Shanghai) using the PHT. Summary statistics were computed across databases. A distributed logistic regression model predicting post-treatment two-year survival was trained on 14 810 patients treated between 1978 and 2011 and validated on 8 393 patients treated between 2012 and 2015.
Conclusion:The PHT infrastructure demonstrably overcomes patient privacy barriers to healthcare data sharing and enables fast data analyses across multiple institutes from different countries with different regulatory regimens. This infrastructure promotes global evidence-based medicine while prioritizing patient privacy.
We present a method to implement probabilistic treatment planning of intensity-modulated radiation therapy using custom software plugins in a commercial treatment planning system. Our method avoids the definition of safety-margins by directly including the effect of geometrical uncertainties during optimization when objective functions are evaluated. Because the shape of the resulting dose distribution implicitly defines the robustness of the plan, the optimizer has much more flexibility than with a margin-based approach. We expect that this added flexibility helps to automatically strike a better balance between target coverage and dose reduction for surrounding healthy tissue, especially for cases where the planning target volume overlaps organs at risk. Prostate cancer treatment planning was chosen to develop our method, including a novel technique to include rotational uncertainties. Based on population statistics, translations and rotations are simulated independently following a marker-based IGRT correction strategy. The effects of random and systematic errors are incorporated by first blurring and then shifting the dose distribution with respect to the clinical target volume. For simplicity and efficiency, dose-shift invariance and a rigid-body approximation are assumed. Three prostate cases were replanned using our probabilistic objective functions. To compare clinical and probabilistic plans, an evaluation tool was used that explicitly incorporates geometric uncertainties using Monte-Carlo methods. The new plans achieved similar or better dose distributions than the original clinical plans in terms of expected target coverage and rectum wall sparing. Plan optimization times were only about a factor of two higher than in the original clinical system. In conclusion, we have developed a practical planning tool that enables margin-less probability-based treatment planning with acceptable planning times, achieving the first system that is feasible for clinical implementation.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.