When analyzed separately, data from small studies provide only limited information with limited clinical generalizability, owing to small sample sizes, differing assessments, and limited scope. In this methodological paper we outline a theoretical framework for meta-analysis of data obtained from disparate studies using disparate tests, based on calibrating the data from such studies and tests onto a unified probability scale. We apply this method to combine the data from five studies examining the diagnostic abilities of different assessments of Attention Deficit/Hyperactivity Disorder (ADHD), including behavioral rating scales and EEG assessments. The studies enrolled a total of 111 subjects: 56 with ADHD and 55 controls. Each individual study had a small sample focused on a specific age/gender group (for example, 8 boys aged 6-10) and generally had insufficient power to detect statistically significant differences. No gender or age comparisons were possible within any single study. However, when calibrated and combined, the data showed a clear separation between ADHD and non-ADHD groups in males below the age of 16 (p < 0.001), males above the age of 16 (p = 0.015), females below the age of 16 (p = 0.0014), and females above the age of 16 (p = 0.0022). We conclude that if data from various studies using various tests are made comparable, the resulting larger and more diverse combined sample leads to increased statistical significance and allows cross-sectional comparisons that are not possible within any individual study.
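
The calibrate-then-pool idea can be illustrated with a minimal sketch. It is not the calibration procedure developed in this paper: the logistic mapping, the cutoffs and scales, the study names, and all scores below are hypothetical, and a one-sided permutation test stands in for whichever test is appropriate for the real data.

```python
import math
import random


def calibrate(score, cutoff, scale):
    """Map a raw test score into (0, 1) with a logistic curve.

    Assumption: a logistic curve centred on a test-specific cutoff is a
    stand-in for a real calibration onto a probability scale.
    """
    return 1.0 / (1.0 + math.exp(-(score - cutoff) / scale))


def mean(values):
    return sum(values) / len(values)


def permutation_p(group_a, group_b, n_perm=2000, seed=0):
    """One-sided permutation p-value for mean(group_a) > mean(group_b)."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = group_a + group_b
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[:len(group_a)]) - mean(pooled[len(group_a):])
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical studies on different raw scales:
# name -> (cutoff, scale, ADHD scores, control scores)
studies = {
    "rating_scale": (60.0, 8.0, [72, 65, 70, 58], [50, 55, 61, 47]),
    "eeg_measure": (4.0, 1.0, [5.1, 4.6, 3.9], [3.2, 4.1, 2.8]),
}

# Calibrate each study with its own parameters, then pool across studies.
adhd, control = [], []
for cutoff, scale, adhd_scores, control_scores in studies.values():
    adhd += [calibrate(s, cutoff, scale) for s in adhd_scores]
    control += [calibrate(s, cutoff, scale) for s in control_scores]

p_value = permutation_p(adhd, control)
print(f"pooled n = {len(adhd) + len(control)}, one-sided p = {p_value:.4f}")
```

Once every score lives on the same (0, 1) scale, subjects from different studies can be grouped by age or gender and compared directly, which no single small study in the sketch would support on its own.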