The SILICOFCM platform is an in-silico cloud computing platform that uses advanced computational workflows for drug development and optimized clinical therapy in hypertrophic cardiomyopathy (HCM). The current study presents the SILICOFCM virtual population model (VPM), which generates high-quality virtual clinical data using both multivariate and machine learning methods, along with virtual geometries for in-silico clinical trials. The proposed VPM workflow includes data quality management functionalities for outlier detection and similarity detection, which are used to enhance the quality of the real patient data. In addition, the virtual clinical data generator that is part of the VPM includes both multivariate methods, such as the multivariate normal distribution, and machine learning methods, such as tree ensembles, artificial neural networks, and Bayesian networks. The VPM was applied in a use-case scenario in which 592 records of patients with HCM were used to generate clinical data for 1000 virtual patients. Our results suggest that the VPM yielded virtual distributions that closely matched the real distributions: the average goodness of fit was 0.038, the Kullback-Leibler (KL) divergence was 0.029, and the absolute correlation difference between the real and the virtual correlation matrices was 0.0443. The VPM also produced virtual geometries that mimic the real ones.
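As a rough illustration of the multivariate normal generator and two of the reported agreement measures, the following minimal sketch fits a multivariate normal distribution to a table of numeric records, samples virtual records from it, and computes a histogram-based KL divergence and a mean absolute correlation difference. It is not the SILICOFCM implementation: the function names, the 20-bin histogram estimator, and the randomly generated stand-in for the 592 real records are all assumptions made for this example.

```python
import numpy as np


def generate_virtual_patients(real, n_virtual, seed=0):
    """Sample virtual records from a multivariate normal fitted to `real`.

    `real` is an (n_patients, n_features) array of numeric clinical features.
    """
    rng = np.random.default_rng(seed)
    mean = real.mean(axis=0)
    cov = np.cov(real, rowvar=False)
    return rng.multivariate_normal(mean, cov, size=n_virtual)


def mean_abs_correlation_difference(real, virtual):
    """Mean absolute element-wise difference between the two correlation matrices."""
    diff = np.corrcoef(real, rowvar=False) - np.corrcoef(virtual, rowvar=False)
    return float(np.abs(diff).mean())


def histogram_kl_divergence(real_col, virtual_col, bins=20, eps=1e-9):
    """KL(real || virtual) for one feature, estimated on shared histogram bins.

    The bin count and the `eps` smoothing of empty bins are illustrative choices.
    """
    lo = min(real_col.min(), virtual_col.min())
    hi = max(real_col.max(), virtual_col.max())
    p, _ = np.histogram(real_col, bins=bins, range=(lo, hi))
    q, _ = np.histogram(virtual_col, bins=bins, range=(lo, hi))
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log(p / q)))


# Stand-in for the real cohort (hypothetical data, NOT the HCM records).
_rng = np.random.default_rng(42)
_cov = np.array([[1.0, 0.6, 0.2],
                 [0.6, 1.0, 0.4],
                 [0.2, 0.4, 1.0]])
real = _rng.multivariate_normal([50.0, 20.0, 5.0], _cov, size=592)

virtual = generate_virtual_patients(real, n_virtual=1000, seed=1)
corr_diff = mean_abs_correlation_difference(real, virtual)
kl = histogram_kl_divergence(real[:, 0], virtual[:, 0])
print(f"correlation difference: {corr_diff:.4f}, KL (feature 0): {kl:.4f}")
```

A multivariate normal preserves only means and linear correlations, which is presumably one reason the VPM also offers tree ensembles, neural networks, and Bayesian networks for features that are skewed or categorical.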
Objective: To develop a computationally efficient and unbiased synthetic data generator for large-scale in silico clinical trials (CTs). Methods: We propose BGMM-OCE, an extension of the conventional Bayesian Gaussian Mixture Model (BGMM) algorithm that provides unbiased estimates of the optimal number of Gaussian components and yields high-quality, large-scale synthetic data at reduced computational complexity. Spectral clustering with efficient eigenvalue decomposition is applied to estimate the hyperparameters of the generator. A case study is conducted to compare the performance of BGMM-OCE against four straightforward synthetic data generators for in silico CTs in hypertrophic cardiomyopathy (HCM). Results: BGMM-OCE generated 30000 virtual patient profiles with the lowest coefficient of variation (0.046) and the lowest inter- and intra-correlation differences (0.017 and 0.016, respectively) relative to the real profiles, in reduced execution time. Conclusions: BGMM-OCE overcomes the limited population size in HCM, which hinders the development of targeted therapies and robust risk stratification models.
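One common way to estimate the number of mixture components from a spectral decomposition is the eigengap heuristic, sketched below. The abstract does not say which criterion BGMM-OCE uses, so this is an assumption chosen for illustration rather than the authors' method; the function name, the RBF affinity, the `sigma` bandwidth, and the toy data are likewise hypothetical. The sketch also uses a dense eigensolver, not the efficient decomposition the abstract mentions.

```python
import numpy as np


def estimate_n_components(X, sigma=1.0, max_k=10):
    """Estimate a cluster count with the eigengap heuristic.

    Builds an RBF affinity matrix, forms the symmetric normalized graph
    Laplacian, and returns the k at which the gap between consecutive
    smallest eigenvalues is largest.
    """
    # Pairwise squared Euclidean distances, shape (n, n).
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops

    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # eigvalsh returns eigenvalues in ascending order.
    eigvals = np.linalg.eigvalsh(L)[: max_k + 1]
    gaps = np.diff(eigvals)
    return int(np.argmax(gaps)) + 1


# Toy data: three well-separated 2-D clusters (hypothetical, not patient data).
_rng = np.random.default_rng(0)
_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.5 * _rng.standard_normal((60, 2)) for c in _centers])

n_components = estimate_n_components(X, sigma=1.0)
print("estimated number of components:", n_components)
```

The estimated count could then be passed as the component setting of a Bayesian Gaussian mixture model. The dense affinity matrix here scales quadratically with the number of records, so a large cohort would need a sparse or approximate decomposition.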