Objective
To develop a multivariate method for quantifying population
representativeness across related clinical studies and a computational
method for identifying and characterizing underrepresented subgroups in
those studies.
Methods
We extended a published metric, the Generalizability Index for Study
Traits (GIST), to include multiple study traits, so that it quantifies the
population representativeness of a set of related studies under the
assumption that all study traits are independent and equally important. On
this basis, we qualitatively compared the effectiveness of GIST and
multivariate GIST (mGIST). We further developed an algorithm called
“Multivariate Underrepresented Subgroup Identification”
(MAGIC) for constructing optimal combinations of distinct value intervals of
multiple traits to define underrepresented subgroups in a set of related
studies. Using Type 2 diabetes mellitus (T2DM) as an example, we identified
and extracted frequently used quantitative eligibility criteria variables in
a set of clinical studies. We profiled the T2DM target population using the
National Health and Nutrition Examination Survey (NHANES) data.
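The idea of scoring representativeness against a profiled target population can be illustrated with a minimal sketch. It is not the published mGIST formula or the MAGIC algorithm: it uses a simplified proxy (the average fraction of studies for which each target-population member is eligible on every trait), treats all traits as equally important, and then lists the trait-interval combinations whose members fall below a chosen threshold. All data values, bin boundaries, the threshold, and the function names (`eligible`, `coverage`, `representativeness`, `low_coverage_cells`) are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical studies: an inclusive (low, high) eligibility range per trait.
studies = [
    {"age": (18, 65), "hba1c": (7.0, 10.0)},
    {"age": (18, 75), "hba1c": (6.5, 9.0)},
    {"age": (30, 70), "hba1c": (7.5, 11.0)},
]

# Hypothetical target-population records (e.g., survey respondents).
population = [
    {"age": 45, "hba1c": 7.2},
    {"age": 68, "hba1c": 6.0},
    {"age": 72, "hba1c": 8.1},
    {"age": 55, "hba1c": 9.5},
]


def eligible(person, study):
    """True if the person satisfies every trait range of the study."""
    return all(lo <= person[t] <= hi for t, (lo, hi) in study.items())


def coverage(person, studies):
    """Fraction of studies for which the person is eligible."""
    return sum(eligible(person, s) for s in studies) / len(studies)


def representativeness(people, studies):
    """Mean coverage over a group of people (simplified proxy score)."""
    return sum(coverage(p, studies) for p in people) / len(people)


def low_coverage_cells(people, studies, bins, threshold):
    """Return {cell: score} for non-empty interval combinations whose
    mean coverage is below the threshold. Bins are half-open [lo, hi)."""
    traits = list(bins)
    result = {}
    for cell in product(*(bins[t] for t in traits)):
        members = [
            p for p in people
            if all(lo <= p[t] < hi for t, (lo, hi) in zip(traits, cell))
        ]
        if not members:
            continue  # no target-population members fall in this cell
        score = representativeness(members, studies)
        if score < threshold:
            result[cell] = score
    return result


overall = representativeness(population, studies)

# Hypothetical bins: age split at 65, HbA1c split at 7.0.
bins = {"age": [(0, 65), (65, 200)], "hba1c": [(0, 7.0), (7.0, 20)]}
low_cells = low_coverage_cells(population, studies, bins, threshold=0.5)
```

In this toy data, the cells that fall below the threshold are the two age-65-and-over cells. A real analysis would use survey weights for the target population and the study-level criteria extracted from trial records, neither of which is modeled here.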
Results
According to the mGIST scores for four example variables (age, HbA1c,
BMI, and gender), the included observational T2DM studies had better
population representativeness than the interventional T2DM studies. Among
the interventional T2DM studies, Phase I trials had better population
representativeness than Phase III trials. People at least 65 years old with
an HbA1c value between 5.7% and 7.2% were particularly underrepresented in
the included T2DM trials. These results are consistent with existing
knowledge and demonstrate the effectiveness of our methods in assessing
population representativeness.
Conclusions
mGIST is effective at quantifying the population representativeness of
related clinical studies using multiple numeric study traits. MAGIC
identifies underrepresented subgroups in clinical studies. Both data-driven
methods can be used to improve the transparency of design bias in
participant selection at the research community level.