Tristimulus values and chromaticities, which are derived from color matching functions (CMFs), are commonly used for color characterization, calibration, and specification, on the assumption that stimuli with the same values have the same color appearance (i.e., form a metameric match). Many studies, however, have found that stimuli with the same tristimulus values do not appear the same (known as metameric failure), because the CMFs fail to characterize the color matching mechanisms accurately. Most past work evaluated the performance of different CMFs through color matching experiments, taking a smaller chromaticity difference or calculated color difference between the reference and test stimuli as evidence of better performance. Such differences, however, may not accurately characterize the performance of the CMFs, since color spaces and chromaticity diagrams may not be uniform in terms of the threshold of noticeable color difference. In this study, human observers evaluated the perceived color difference between pairs of stimuli that were calibrated to have the same tristimulus values, calculated using the CIE 1931 2° CMFs, under two field-of-view (FOV) sizes. The results suggested that stimuli with the same tristimulus values do not always appear the same, and that the mismatch depends on the primaries and the FOV. More importantly, the results suggested that the chromaticity differences obtained from color matching experiments may overestimate metameric failure, especially when the color differences are small.
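
For reference, the quantities named above can be written out using the standard CIE definitions for an emissive stimulus. This is a sketch for orientation only: the symbols $S(\lambda)$ (spectral power distribution of the stimulus), $k$ (normalizing constant), and the subscripts 1 and 2 (the two stimuli of a pair) are notation assumed here, not taken from the study.

```latex
% Tristimulus values from the CMFs \bar{x}, \bar{y}, \bar{z}
% (integration over the visible range)
X = k \int S(\lambda)\,\bar{x}(\lambda)\,d\lambda, \qquad
Y = k \int S(\lambda)\,\bar{y}(\lambda)\,d\lambda, \qquad
Z = k \int S(\lambda)\,\bar{z}(\lambda)\,d\lambda

% Chromaticity coordinates
x = \frac{X}{X + Y + Z}, \qquad
y = \frac{Y}{X + Y + Z}

% Two stimuli with different spectra are predicted to match (metamers) when
S_1(\lambda) \neq S_2(\lambda)
\quad\text{but}\quad
(X_1, Y_1, Z_1) = (X_2, Y_2, Z_2)
```

Under these definitions, metameric failure means that two stimuli satisfying the last condition for a given set of CMFs are nonetheless perceived as different by an observer.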