Despite significant scholarly attention, the literature on the existence and direction of gender differences in creativity has produced inconsistent findings. In the present paper, we argue that this lack of consensus may be attributable, at least in part, to gender-specific inconsistencies in the measurement of creative problem-solving. To explore this possibility, we empirically tested assumptions of multiple-group measurement invariance using samples borrowed from four recent studies that assessed creative problem-solving (J.D. Barrett et al., 2013;. Across the four samples, apparent gender differences emerged on all three components of S.P. Besemer & K. O'Quin's (1999) three-facet model of creativity (i.e., quality, originality, and elegance) such that, on average, females appeared to exhibit higher baseline levels of creativity. However, because measurement invariance assumptions were violated across genders in these samples, such comparisons may not ultimately be appropriate. Although the underlying factor structure and the factor loadings on a unitary creativity factor were consistent across gender (i.e., weak factorial invariance), measurement equivalence assumptions were violated at the subfacet level (i.e., strong factorial invariance). Implications of these findings for understanding gender differences in creative problem-solving are discussed.