2023
DOI: 10.1055/a-2023-9181

Evaluating the Impact of Health Care Data Completeness for Deep Generative Models

Abstract: Background: Deep generative models (DGMs) present a promising avenue for generating realistic, synthetic data to augment existing healthcare datasets. However, exactly how the completeness of the original dataset affects the quality of the generated synthetic data is unclear. Objectives: In this paper, we investigate the effect of data completeness on samples generated by the most common DGM paradigms. Methods: We create both cross-sectional and panel datasets with varying missingness and subset rates and trai…

Cited by 2 publications
(4 citation statements)
References 50 publications

“…11 Tahar et al decline the use of the term “consistency” for their study and refer to plausibility instead, broadly defined as “deviations from expected values.” 14 Smith et al talk about “consistency and similarity of synthetic samples” without precisely mentioning their interpretation of the dimension. 16 Beyond the different interpretations of terms that are directly related to the topic data quality, one might imagine similar deviations regarding basic concepts such as “data set” or “metadata.” Tahar et al looked at those terms and propose respective definitions. 14 Nevertheless, the manifold perspectives and interpretations of domain-specific and basic terms will impede the distribution of the concepts, frameworks, packages, and tools presented by the papers to a wider application.…”
Section: Discussion (mentioning)
confidence: 99%
“…14 The two remaining papers picked up the approach of synthetic data collections. 15,16 Concepts Related to Data Quality Mashoufi et al present a scoping review about the main concepts related to data quality and data quality assessment methodologies. 9 In their search for these concepts, they focused on health care using the search term "medical record" and considered both paper-based and computerized records.…”
Section: Selection Process (mentioning)
confidence: 99%