2022
DOI: 10.1136/bmjmed-2022-000167

Synthetic data in medical research

Abstract:
⇒ Synthetic data are artificial data that can be used to support efficient medical and healthcare research, while minimising the need to access personal data.
⇒ More research is needed to determine the extent to which synthetic data can be relied on for formal analysis, the cost effectiveness of generating synthetic data, and how to accurately assess disclosure risk.
Synthetic data have the potential to improve medical research while minimising the need to access personal data; Theodora Kokosi and Katie Harron e…

Citations: cited by 26 publications (10 citation statements)
References: 16 publications
“…These data can perpetuate and/or accentuate biases underlying the original data used to create the data generation model, might lack interpretability due to the black-box nature of underlying algorithms leading to lack of trust in using for real applications, might reveal confidential information in an adversarial attack, and lack consensus on evaluation of data quality. 60,61,72 These challenges are being carefully addressed by development of regulatory policies around using these data for improving patient outcomes. 60 Regulatory considerations (with a focus on fit-for-purpose risk-based framework) for the use of AI/ML for precision medicine…”
Section: Data Sharing Distributed Learning and Synthetic Data Generation
Citation type: mentioning
confidence: 99%
“…Consequently, the use of synthetic data has gained popularity as a potential approach to enhance research reproducibility and implement differential privacy for protected health information [51]. The goal of data synthesis is to create a dataset that closely resembles the original individual-level data and retrain prediction models [52]. Synthesized data can expedite methodological advancements in medical research and assist in processing high-dimensional and challenging medical data.…”
Section: Challenges and Future Directions For ML Application
Citation type: mentioning
confidence: 99%
“…Real-world evidence, 33,39 is shaped into an engineered 'ground truth', artificially augmented and synthesized 60,61 to simulate temporal context and longitude 62 - conditions necessary to measure performance against desired patient outcomes. This approach is not sustainable, as it is both resource-intensive and fails the tests of explainability and reproducibility.…”
Section: Data Value Predicament In Health Innovation
Citation type: mentioning
confidence: 99%